All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms
@ 2021-07-22  7:39 ZheNing Hu via GitGitGadget
  2021-07-22  7:39 ` [PATCH 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
                   ` (6 more replies)
  0 siblings, 7 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-22  7:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu

This patch series is split from my main patch series zh/ref-filter-raw-data,
to make things easier for reviewers to read.

This patch series provided %(raw) and %(rest) for ref-filter, which will be
used by zh/cat-file-reuse-ref-filter-logic later.

ZheNing Hu (5):
  [GSOC] ref-filter: add obj-type check in grab contents
  [GSOC] ref-filter: add %(raw) atom
  [GSOC] ref-filter: --format=%(raw) re-support --perl
  [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  [GSOC] ref-filter: add %(rest) atom

 Documentation/git-for-each-ref.txt |   9 ++
 builtin/tag.c                      |   2 +-
 quote.c                            |  17 ++
 quote.h                            |   1 +
 ref-filter.c                       | 240 ++++++++++++++++++++++-------
 ref-filter.h                       |   9 +-
 t/t3203-branch-output.sh           |   4 +
 t/t6300-for-each-ref.sh            | 235 ++++++++++++++++++++++++++++
 t/t7004-tag.sh                     |   4 +
 t/t7030-verify-tag.sh              |   4 +
 10 files changed, 463 insertions(+), 62 deletions(-)


base-commit: daab8a564f8bbac55f70f8bf86c070e001a9b006
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1000%2Fadlternative%2Fref-filter-raw-data-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1000/adlternative/ref-filter-raw-data-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1000
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-22  7:39 [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
@ 2021-07-22  7:39 ` ZheNing Hu via GitGitGadget
  2021-07-22  8:51   ` Christian Couder
  2021-07-22  7:39 ` [PATCH 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-22  7:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase.  We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag.

Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4db0e40ff4c..5cee6512fba 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 }
 
 /* See grab_values */
-static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
+static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
 {
 	int i;
 	const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
 	size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
+	void *buf = data->content;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
@@ -1371,10 +1372,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			continue;
 		if (deref)
 			name++;
-		if (strcmp(name, "body") &&
-		    !starts_with(name, "subject") &&
-		    !starts_with(name, "trailers") &&
-		    !starts_with(name, "contents"))
+
+		if ((data->type != OBJ_TAG &&
+		     data->type != OBJ_COMMIT) ||
+		    (strcmp(name, "body") &&
+		     !starts_with(name, "subject") &&
+		     !starts_with(name, "trailers") &&
+		     !starts_with(name, "contents")))
 			continue;
 		if (!subpos)
 			find_subpos(buf,
@@ -1438,17 +1442,19 @@ static void fill_missing_values(struct atom_value *val)
  * pointed at by the ref itself; otherwise it is the object the
  * ref (which is a tag) refers to.
  */
-static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
+static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
 {
+	void *buf = data->content;
+
 	switch (obj->type) {
 	case OBJ_TAG:
 		grab_tag_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("tagger", val, deref, buf);
 		break;
 	case OBJ_COMMIT:
 		grab_commit_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
 		break;
@@ -1678,7 +1684,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
 			return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
 					       oid_to_hex(&oi->oid), ref->refname);
 		}
-		grab_values(ref->value, deref, *obj, oi->content);
+		grab_values(ref->value, deref, *obj, oi);
 	}
 
 	grab_common_values(ref->value, deref, oi);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-22  7:39 [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  2021-07-22  7:39 ` [PATCH 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
@ 2021-07-22  7:39 ` ZheNing Hu via GitGitGadget
  2021-07-22  8:06   ` Ævar Arnfjörð Bjarmason
  2021-07-22  7:39 ` [PATCH 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-22  7:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.

The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depends on the output of the atom being
text (specifically, no embedded NULs in it).

E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.

Therefore, we need to find a way to record the length of the
atom_value's member `s`. Although strbuf can already record the
string and its length, if we want to replace the type of atom_value's
member `s` with strbuf, many places in ref-filter that are filled
with dynamically allocated mermory in `v->s` are not easy to replace.
At the same time, we need to check if `v->s == NULL` in
populate_value(), and strbuf cannot easily distinguish NULL and empty
strings, but c-style "const char *" can do it. So add a new member in
`struct atom_value`: `s_size`, which can record raw object size, it
can help us add raw object data to the buffer or compare two buffers
which contain raw object data.

Note that `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, and `--perl` because if the binary raw data is passed to a
variable in such languages, these may not support arbitrary binary data
in their string variable type.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |   9 ++
 ref-filter.c                       | 140 +++++++++++++++----
 t/t6300-for-each-ref.sh            | 216 +++++++++++++++++++++++++++++
 3 files changed, 338 insertions(+), 27 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2ae2478de70..cbb6f87d13f 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -235,6 +235,15 @@ and `date` to extract the named component.  For email fields (`authoremail`,
 without angle brackets, and `:localpart` to get the part before the `@` symbol
 out of the trimmed email.
 
+The raw data in an object is `raw`.
+
+raw:size::
+	The raw data size of the object.
+
+Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
+`--perl` because such language may not support arbitrary binary data in their
+string variable type.
+
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
 
diff --git a/ref-filter.c b/ref-filter.c
index 5cee6512fba..506fbc3d691 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -144,6 +144,7 @@ enum atom_type {
 	ATOM_BODY,
 	ATOM_TRAILERS,
 	ATOM_CONTENTS,
+	ATOM_RAW,
 	ATOM_UPSTREAM,
 	ATOM_PUSH,
 	ATOM_SYMREF,
@@ -189,6 +190,9 @@ static struct used_atom {
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
+		struct {
+			enum { RAW_BARE, RAW_LENGTH } option;
+		} raw_data;
 		struct {
 			cmp_status cmp_status;
 			const char *str;
@@ -426,6 +430,18 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
+static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.raw_data.option = RAW_BARE;
+	else if (!strcmp(arg, "size"))
+		atom->u.raw_data.option = RAW_LENGTH;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
+	return 0;
+}
+
 static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
@@ -586,6 +602,7 @@ static struct {
 	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
 	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
 	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_RAW] = { "raw", SOURCE_OBJ, FIELD_STR, raw_atom_parser },
 	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -620,12 +637,15 @@ struct ref_formatting_state {
 
 struct atom_value {
 	const char *s;
+	size_t s_size;
 	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
 		       struct strbuf *err);
 	uintmax_t value; /* used for sorting when not FIELD_STR */
 	struct used_atom *atom;
 };
 
+#define ATOM_VALUE_S_SIZE_INIT (-1)
+
 /*
  * Used to parse format string and sort specifiers
  */
@@ -644,13 +664,6 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return strbuf_addf_ret(err, -1, _("malformed field name: %.*s"),
 				       (int)(ep-atom), atom);
 
-	/* Do we have the atom already used elsewhere? */
-	for (i = 0; i < used_atom_cnt; i++) {
-		int len = strlen(used_atom[i].name);
-		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
-			return i;
-	}
-
 	/*
 	 * If the atom name has a colon, strip it and everything after
 	 * it off - it specifies the format for this entry, and
@@ -660,6 +673,13 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	arg = memchr(sp, ':', ep - sp);
 	atom_len = (arg ? arg : ep) - sp;
 
+	/* Do we have the atom already used elsewhere? */
+	for (i = 0; i < used_atom_cnt; i++) {
+		int len = strlen(used_atom[i].name);
+		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
+			return i;
+	}
+
 	/* Is the atom a valid one? */
 	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
 		int len = strlen(valid_atom[i].name);
@@ -709,11 +729,14 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	return at;
 }
 
-static void quote_formatting(struct strbuf *s, const char *str, int quote_style)
+static void quote_formatting(struct strbuf *s, const char *str, size_t len, int quote_style)
 {
 	switch (quote_style) {
 	case QUOTE_NONE:
-		strbuf_addstr(s, str);
+		if (len != ATOM_VALUE_S_SIZE_INIT)
+			strbuf_add(s, str, len);
+		else
+			strbuf_addstr(s, str);
 		break;
 	case QUOTE_SHELL:
 		sq_quote_buf(s, str);
@@ -740,9 +763,12 @@ static int append_atom(struct atom_value *v, struct ref_formatting_state *state,
 	 * encountered.
 	 */
 	if (!state->stack->prev)
-		quote_formatting(&state->stack->output, v->s, state->quote_style);
+		quote_formatting(&state->stack->output, v->s, v->s_size, state->quote_style);
 	else
-		strbuf_addstr(&state->stack->output, v->s);
+		if (v->s_size != ATOM_VALUE_S_SIZE_INIT)
+			strbuf_add(&state->stack->output, v->s, v->s_size);
+		else
+			strbuf_addstr(&state->stack->output, v->s);
 	return 0;
 }
 
@@ -842,21 +868,23 @@ static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state
 	return 0;
 }
 
-static int is_empty(const char *s)
+static int is_empty(struct strbuf *buf)
 {
-	while (*s != '\0') {
-		if (!isspace(*s))
-			return 0;
-		s++;
-	}
-	return 1;
-}
+	const char *cur = buf->buf;
+	const char *end = buf->buf + buf->len;
+
+	while (cur != end && (isspace(*cur)))
+		cur++;
+
+	return cur == end;
+ }
 
 static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_state *state,
 			     struct strbuf *err)
 {
 	struct ref_formatting_stack *cur = state->stack;
 	struct if_then_else *if_then_else = NULL;
+	size_t str_len = 0;
 
 	if (cur->at_end == if_then_else_handler)
 		if_then_else = (struct if_then_else *)cur->at_end_data;
@@ -867,18 +895,22 @@ static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_sta
 	if (if_then_else->else_atom_seen)
 		return strbuf_addf_ret(err, -1, _("format: %%(then) atom used after %%(else)"));
 	if_then_else->then_atom_seen = 1;
+	if (if_then_else->str)
+		str_len = strlen(if_then_else->str);
 	/*
 	 * If the 'equals' or 'notequals' attribute is used then
 	 * perform the required comparison. If not, only non-empty
 	 * strings satisfy the 'if' condition.
 	 */
 	if (if_then_else->cmp_status == COMPARE_EQUAL) {
-		if (!strcmp(if_then_else->str, cur->output.buf))
+		if (str_len == cur->output.len &&
+		    !memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
 	} else if (if_then_else->cmp_status == COMPARE_UNEQUAL) {
-		if (strcmp(if_then_else->str, cur->output.buf))
+		if (str_len != cur->output.len ||
+		    memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
-	} else if (cur->output.len && !is_empty(cur->output.buf))
+	} else if (cur->output.len && !is_empty(&cur->output))
 		if_then_else->condition_satisfied = 1;
 	strbuf_reset(&cur->output);
 	return 0;
@@ -924,7 +956,7 @@ static int end_atom_handler(struct atom_value *atomv, struct ref_formatting_stat
 	 * only on the topmost supporting atom.
 	 */
 	if (!current->prev->prev) {
-		quote_formatting(&s, current->output.buf, state->quote_style);
+		quote_formatting(&s, current->output.buf, current->output.len, state->quote_style);
 		strbuf_swap(&current->output, &s);
 	}
 	strbuf_release(&s);
@@ -974,6 +1006,10 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
+		    used_atom[at].u.raw_data.option == RAW_BARE)
+			die(_("--format=%.*s cannot be used with"
+			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
@@ -1367,12 +1403,25 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
 		struct used_atom *atom = &used_atom[i];
 		const char *name = atom->name;
 		struct atom_value *v = &val[i];
+		enum atom_type atom_type = atom->atom_type;
 
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
+		if (atom_type == ATOM_RAW) {
+			unsigned long buf_size = data->size;
+
+			if (atom->u.raw_data.option == RAW_BARE) {
+				v->s = xmemdupz(buf, buf_size);
+				v->s_size = buf_size;
+			} else if (atom->u.raw_data.option == RAW_LENGTH) {
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
+			}
+			continue;
+		}
+
 		if ((data->type != OBJ_TAG &&
 		     data->type != OBJ_COMMIT) ||
 		    (strcmp(name, "body") &&
@@ -1460,9 +1509,11 @@ static void grab_values(struct atom_value *val, int deref, struct object *obj, s
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	case OBJ_BLOB:
 		/* grab_blob_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	default:
 		die("Eh?  Object of type %d?", obj->type);
@@ -1766,6 +1817,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 		const char *refname;
 		struct branch *branch = NULL;
 
+		v->s_size = ATOM_VALUE_S_SIZE_INIT;
 		v->handler = append_atom;
 		v->atom = atom;
 
@@ -2369,6 +2421,19 @@ static int compare_detached_head(struct ref_array_item *a, struct ref_array_item
 	return 0;
 }
 
+static int memcasecmp(const void *vs1, const void *vs2, size_t n)
+{
+	const char *s1 = vs1, *s2 = vs2;
+	const char *end = s1 + n;
+
+	for (; s1 < end; s1++, s2++) {
+		int diff = tolower(*s1) - tolower(*s2);
+		if (diff)
+			return diff;
+	}
+	return 0;
+}
+
 static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
 {
 	struct atom_value *va, *vb;
@@ -2389,10 +2454,30 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
 	} else if (s->sort_flags & REF_SORTING_VERSION) {
 		cmp = versioncmp(va->s, vb->s);
 	} else if (cmp_type == FIELD_STR) {
-		int (*cmp_fn)(const char *, const char *);
-		cmp_fn = s->sort_flags & REF_SORTING_ICASE
-			? strcasecmp : strcmp;
-		cmp = cmp_fn(va->s, vb->s);
+		if (va->s_size == ATOM_VALUE_S_SIZE_INIT &&
+		    vb->s_size == ATOM_VALUE_S_SIZE_INIT) {
+			int (*cmp_fn)(const char *, const char *);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? strcasecmp : strcmp;
+			cmp = cmp_fn(va->s, vb->s);
+		} else {
+			size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
+					strlen(va->s) : va->s_size;
+			size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
+					strlen(vb->s) : vb->s_size;
+			int (*cmp_fn)(const void *, const void *, size_t);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? memcasecmp : memcmp;
+
+			cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
+				     a_size : b_size);
+			if (!cmp) {
+				if (a_size > b_size)
+					cmp = 1;
+				else if (a_size < b_size)
+					cmp = -1;
+			}
+		}
 	} else {
 		if (va->value < vb->value)
 			cmp = -1;
@@ -2492,6 +2577,7 @@ int format_ref_array_item(struct ref_array_item *info,
 	}
 	if (format->need_color_reset_at_eol) {
 		struct atom_value resetv;
+		resetv.s_size = ATOM_VALUE_S_SIZE_INIT;
 		resetv.s = GIT_COLOR_RESET;
 		if (append_atom(&resetv, &state, error_buf)) {
 			pop_stack_element(&state.stack);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 9e0214076b4..18554f62d94 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -130,6 +130,8 @@ test_atom head parent:short=10 ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
+test_atom head raw "$(git cat-file commit refs/heads/main)
+"
 test_atom head '*objectname' ''
 test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
@@ -221,6 +223,15 @@ test_atom tag contents 'Tagging at 1151968727
 '
 test_atom tag HEAD ' '
 
+test_expect_success 'basic atom: refs/tags/testtag *raw' '
+	git cat-file commit refs/tags/testtag^{} >expected &&
+	git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'Check invalid atoms names are errors' '
 	test_must_fail git for-each-ref --format="%(INVALID)" refs/heads
 '
@@ -686,6 +697,15 @@ test_atom refs/tags/signed-empty contents:body ''
 test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-empty raw' '
+	git cat-file tag refs/tags/signed-empty >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-empty >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-short subject 'subject line'
 test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
@@ -695,6 +715,15 @@ test_atom refs/tags/signed-short contents:signature "$sig"
 test_atom refs/tags/signed-short contents "subject line
 $sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-short raw' '
+	git cat-file tag refs/tags/signed-short >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-short >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-long subject 'subject line'
 test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
@@ -708,6 +737,15 @@ test_atom refs/tags/signed-long contents "subject line
 body contents
 $sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-long raw' '
+	git cat-file tag refs/tags/signed-long >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-long >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'set up refs pointing to tree and blob' '
 	git update-ref refs/mytrees/first refs/heads/main^{tree} &&
 	git update-ref refs/myblobs/first refs/heads/main:one
@@ -720,6 +758,16 @@ test_atom refs/mytrees/first contents:body ""
 test_atom refs/mytrees/first contents:signature ""
 test_atom refs/mytrees/first contents ""
 
+test_expect_success 'basic atom: refs/mytrees/first raw' '
+	git cat-file tree refs/mytrees/first >expected &&
+	echo >>expected &&
+	git for-each-ref --format="%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/mytrees/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_atom refs/myblobs/first subject ""
 test_atom refs/myblobs/first contents:subject ""
 test_atom refs/myblobs/first body ""
@@ -727,6 +775,174 @@ test_atom refs/myblobs/first contents:body ""
 test_atom refs/myblobs/first contents:signature ""
 test_atom refs/myblobs/first contents ""
 
+test_expect_success 'basic atom: refs/myblobs/first raw' '
+	git cat-file blob refs/myblobs/first >expected &&
+	echo >>expected &&
+	git for-each-ref --format="%(raw)" refs/myblobs/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/myblobs/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/myblobs/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'set up refs pointing to binary blob' '
+	printf "a\0b\0c" >blob1 &&
+	printf "a\0c\0b" >blob2 &&
+	printf "\0a\0b\0c" >blob3 &&
+	printf "abc" >blob4 &&
+	printf "\0 \0 \0 " >blob5 &&
+	printf "\0 \0a\0 " >blob6 &&
+	printf "  " >blob7 &&
+	>blob8 &&
+	obj=$(git hash-object -w blob1) &&
+	git update-ref refs/myblobs/blob1 "$obj" &&
+	obj=$(git hash-object -w blob2) &&
+	git update-ref refs/myblobs/blob2 "$obj" &&
+	obj=$(git hash-object -w blob3) &&
+	git update-ref refs/myblobs/blob3 "$obj" &&
+	obj=$(git hash-object -w blob4) &&
+	git update-ref refs/myblobs/blob4 "$obj" &&
+	obj=$(git hash-object -w blob5) &&
+	git update-ref refs/myblobs/blob5 "$obj" &&
+	obj=$(git hash-object -w blob6) &&
+	git update-ref refs/myblobs/blob6 "$obj" &&
+	obj=$(git hash-object -w blob7) &&
+	git update-ref refs/myblobs/blob7 "$obj" &&
+	obj=$(git hash-object -w blob8) &&
+	git update-ref refs/myblobs/blob8 "$obj"
+'
+
+test_expect_success 'Verify sorts with raw' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob3
+	refs/myblobs/blob7
+	refs/mytrees/first
+	refs/myblobs/first
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob4
+	refs/heads/main
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'Verify sorts with raw:size' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/first
+	refs/myblobs/blob7
+	refs/heads/main
+	refs/myblobs/blob4
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/mytrees/first
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw:size \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:equals)' '
+	cat >expected <<-EOF &&
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	refs/myblobs/blob4
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	EOF
+	git for-each-ref --format="%(if:equals=abc)%(raw)%(then)%(refname)%(else)not equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:notequals)' '
+	cat >expected <<-EOF &&
+	refs/heads/ambiguous
+	refs/heads/main
+	refs/heads/newtag
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	equals
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob7
+	refs/myblobs/blob8
+	refs/myblobs/first
+	EOF
+	git for-each-ref --format="%(if:notequals=abc)%(raw)%(then)%(refname)%(else)equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'empty raw refs with %(if)' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob1 not empty
+	refs/myblobs/blob2 not empty
+	refs/myblobs/blob3 not empty
+	refs/myblobs/blob4 not empty
+	refs/myblobs/blob5 not empty
+	refs/myblobs/blob6 not empty
+	refs/myblobs/blob7 empty
+	refs/myblobs/blob8 empty
+	refs/myblobs/first not empty
+	EOF
+	git for-each-ref --format="%(refname) %(if)%(raw)%(then)not empty%(else)empty%(end)" \
+		refs/myblobs/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success '%(raw) with --python must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --python
+'
+
+test_expect_success '%(raw) with --tcl must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --tcl
+'
+
+test_expect_success '%(raw) with --perl must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --perl
+'
+
+test_expect_success '%(raw) with --shell must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --shell
+'
+
+test_expect_success '%(raw) with --shell and --sort=raw must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
+'
+
+test_expect_success '%(raw:size) with --shell' '
+	git for-each-ref --format="%(raw:size)" | while read line
+	do
+		echo "'\''$line'\''" >>expect
+	done &&
+	git for-each-ref --format="%(raw:size)" --shell >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'for-each-ref --format compare with cat-file --batch' '
+	git rev-parse refs/mytrees/first | git cat-file --batch >expected &&
+	git for-each-ref --format="%(objectname) %(objecttype) %(objectsize)
+%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_expect_success 'set up multiple-sort tags' '
 	for when in 100000 200000
 	do
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl
  2021-07-22  7:39 [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  2021-07-22  7:39 ` [PATCH 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
  2021-07-22  7:39 ` [PATCH 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
@ 2021-07-22  7:39 ` ZheNing Hu via GitGitGadget
  2021-07-22  7:39 ` [PATCH 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-22  7:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Because the perl language can handle binary data correctly,
add the function perl_quote_buf_with_len(), which can specify
the length of the data and prevent the data from being truncated
at '\0' to help `--format="%(raw)"` re-support `--perl`.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |  4 ++--
 quote.c                            | 17 +++++++++++++++++
 quote.h                            |  1 +
 ref-filter.c                       | 15 +++++++++++----
 t/t6300-for-each-ref.sh            | 19 +++++++++++++++++--
 5 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index cbb6f87d13f..6da899c6296 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -241,8 +241,8 @@ raw:size::
 	The raw data size of the object.
 
 Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
-`--perl` because such language may not support arbitrary binary data in their
-string variable type.
+because such language may not support arbitrary binary data in their string
+variable type.
 
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
diff --git a/quote.c b/quote.c
index 8a3a5e39eb1..26719d21d1e 100644
--- a/quote.c
+++ b/quote.c
@@ -471,6 +471,23 @@ void perl_quote_buf(struct strbuf *sb, const char *src)
 	strbuf_addch(sb, sq);
 }
 
+void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len)
+{
+	const char sq = '\'';
+	const char bq = '\\';
+	const char *c = src;
+	const char *end = src + len;
+
+	strbuf_addch(sb, sq);
+	while (c != end) {
+		if (*c == sq || *c == bq)
+			strbuf_addch(sb, bq);
+		strbuf_addch(sb, *c);
+		c++;
+	}
+	strbuf_addch(sb, sq);
+}
+
 void python_quote_buf(struct strbuf *sb, const char *src)
 {
 	const char sq = '\'';
diff --git a/quote.h b/quote.h
index 768cc6338e2..0fe69e264b0 100644
--- a/quote.h
+++ b/quote.h
@@ -94,6 +94,7 @@ char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigne
 
 /* quoting as a string literal for other languages */
 void perl_quote_buf(struct strbuf *sb, const char *src);
+void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len);
 void python_quote_buf(struct strbuf *sb, const char *src);
 void tcl_quote_buf(struct strbuf *sb, const char *src);
 void basic_regex_quote_buf(struct strbuf *sb, const char *src);
diff --git a/ref-filter.c b/ref-filter.c
index 506fbc3d691..ba9ab35d7ec 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -742,7 +742,10 @@ static void quote_formatting(struct strbuf *s, const char *str, size_t len, int
 		sq_quote_buf(s, str);
 		break;
 	case QUOTE_PERL:
-		perl_quote_buf(s, str);
+		if (len != ATOM_VALUE_S_SIZE_INIT)
+			perl_quote_buf_with_len(s, str, len);
+		else
+			perl_quote_buf(s, str);
 		break;
 	case QUOTE_PYTHON:
 		python_quote_buf(s, str);
@@ -1006,10 +1009,14 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
-		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
-		    used_atom[at].u.raw_data.option == RAW_BARE)
+
+		if ((format->quote_style == QUOTE_PYTHON ||
+		     format->quote_style == QUOTE_SHELL ||
+		     format->quote_style == QUOTE_TCL) &&
+		     used_atom[at].atom_type == ATOM_RAW &&
+		     used_atom[at].u.raw_data.option == RAW_BARE)
 			die(_("--format=%.*s cannot be used with"
-			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
+			      "--python, --shell, --tcl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 18554f62d94..3d15d0a5360 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -915,8 +915,23 @@ test_expect_success '%(raw) with --tcl must fail' '
 	test_must_fail git for-each-ref --format="%(raw)" --tcl
 '
 
-test_expect_success '%(raw) with --perl must fail' '
-	test_must_fail git for-each-ref --format="%(raw)" --perl
+test_expect_success '%(raw) with --perl' '
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob1 --perl | perl >actual &&
+	cmp blob1 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob3 --perl | perl >actual &&
+	cmp blob3 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob8 --perl | perl >actual &&
+	cmp blob8 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/first --perl | perl >actual &&
+	cmp one actual &&
+	git cat-file tree refs/mytrees/first > expected &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/mytrees/first --perl | perl >actual &&
+	cmp expected actual
 '
 
 test_expect_success '%(raw) with --shell must fail' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  2021-07-22  7:39 [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                   ` (2 preceding siblings ...)
  2021-07-22  7:39 ` [PATCH 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
@ 2021-07-22  7:39 ` ZheNing Hu via GitGitGadget
  2021-07-22  7:39 ` [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-22  7:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Use non-const ref_format in *_atom_parser(), which can help us
modify the members of ref_format in *_atom_parser().

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/tag.c |  2 +-
 ref-filter.c  | 44 ++++++++++++++++++++++----------------------
 ref-filter.h  |  4 ++--
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 82fcfc09824..452558ec957 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -146,7 +146,7 @@ static int verify_tag(const char *name, const char *ref,
 		      const struct object_id *oid, void *cb_data)
 {
 	int flags;
-	const struct ref_format *format = cb_data;
+	struct ref_format *format = cb_data;
 	flags = GPG_VERIFY_VERBOSE;
 
 	if (format->format)
diff --git a/ref-filter.c b/ref-filter.c
index ba9ab35d7ec..c8e561a3687 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -226,7 +226,7 @@ static int strbuf_addf_ret(struct strbuf *sb, int ret, const char *fmt, ...)
 	return ret;
 }
 
-static int color_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int color_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *color_value, struct strbuf *err)
 {
 	if (!color_value)
@@ -264,7 +264,7 @@ static int refname_atom_parser_internal(struct refname_atom *atom, const char *a
 	return 0;
 }
 
-static int remote_ref_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int remote_ref_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	struct string_list params = STRING_LIST_INIT_DUP;
@@ -311,7 +311,7 @@ static int remote_ref_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objecttype_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objecttype_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -323,7 +323,7 @@ static int objecttype_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objectsize_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objectsize_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -343,7 +343,7 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int deltabase_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int deltabase_atom_parser(struct ref_format *format, struct used_atom *atom,
 				 const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -355,7 +355,7 @@ static int deltabase_atom_parser(const struct ref_format *format, struct used_at
 	return 0;
 }
 
-static int body_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int body_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -364,7 +364,7 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 	return 0;
 }
 
-static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int subject_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -376,7 +376,7 @@ static int subject_atom_parser(const struct ref_format *format, struct used_atom
 	return 0;
 }
 
-static int trailers_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int trailers_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	atom->u.contents.trailer_opts.no_divider = 1;
@@ -402,7 +402,7 @@ static int trailers_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int contents_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int contents_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -430,7 +430,7 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int raw_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -442,7 +442,7 @@ static int raw_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int oid_atom_parser(struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -461,7 +461,7 @@ static int oid_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int person_email_atom_parser(struct ref_format *format, struct used_atom *atom,
 				    const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -475,7 +475,7 @@ static int person_email_atom_parser(const struct ref_format *format, struct used
 	return 0;
 }
 
-static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int refname_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	return refname_atom_parser_internal(&atom->u.refname, arg, atom->name, err);
@@ -492,7 +492,7 @@ static align_type parse_align_position(const char *s)
 	return -1;
 }
 
-static int align_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int align_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *arg, struct strbuf *err)
 {
 	struct align *align = &atom->u.align;
@@ -544,7 +544,7 @@ static int align_atom_parser(const struct ref_format *format, struct used_atom *
 	return 0;
 }
 
-static int if_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 			  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -559,7 +559,7 @@ static int if_atom_parser(const struct ref_format *format, struct used_atom *ato
 	return 0;
 }
 
-static int head_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
 	atom->u.head = resolve_refdup("HEAD", RESOLVE_REF_READING, NULL, NULL);
@@ -570,7 +570,7 @@ static struct {
 	const char *name;
 	info_source source;
 	cmp_type cmp_type;
-	int (*parser)(const struct ref_format *format, struct used_atom *atom,
+	int (*parser)(struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
 	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -649,7 +649,7 @@ struct atom_value {
 /*
  * Used to parse format string and sort specifiers
  */
-static int parse_ref_filter_atom(const struct ref_format *format,
+static int parse_ref_filter_atom(struct ref_format *format,
 				 const char *atom, const char *ep,
 				 struct strbuf *err)
 {
@@ -2554,9 +2554,9 @@ static void append_literal(const char *cp, const char *ep, struct ref_formatting
 }
 
 int format_ref_array_item(struct ref_array_item *info,
-			   const struct ref_format *format,
-			   struct strbuf *final_buf,
-			   struct strbuf *error_buf)
+			  struct ref_format *format,
+			  struct strbuf *final_buf,
+			  struct strbuf *error_buf)
 {
 	const char *cp, *sp, *ep;
 	struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
@@ -2601,7 +2601,7 @@ int format_ref_array_item(struct ref_array_item *info,
 }
 
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format)
+		      struct ref_format *format)
 {
 	struct ref_array_item *ref_item;
 	struct strbuf output = STRBUF_INIT;
diff --git a/ref-filter.h b/ref-filter.h
index baf72a71896..74fb423fc89 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -116,7 +116,7 @@ void ref_array_sort(struct ref_sorting *sort, struct ref_array *array);
 void ref_sorting_set_sort_flags_all(struct ref_sorting *sorting, unsigned int mask, int on);
 /*  Based on the given format and quote_style, fill the strbuf */
 int format_ref_array_item(struct ref_array_item *info,
-			  const struct ref_format *format,
+			  struct ref_format *format,
 			  struct strbuf *final_buf,
 			  struct strbuf *error_buf);
 /*  Parse a single sort specifier and add it to the list */
@@ -137,7 +137,7 @@ void setup_ref_filter_porcelain_msg(void);
  * name must be a fully qualified refname.
  */
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format);
+		      struct ref_format *format);
 
 /*
  * Push a single ref onto the array; this can be used to construct your own
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-22  7:39 [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                   ` (3 preceding siblings ...)
  2021-07-22  7:39 ` [PATCH 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
@ 2021-07-22  7:39 ` ZheNing Hu via GitGitGadget
  2021-07-22  8:11   ` Ævar Arnfjörð Bjarmason
  2021-07-23  9:09   ` Jacob Keller
  2021-07-22  8:08 ` [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms Christian Couder
  2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
  6 siblings, 2 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-22  7:39 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

In order to let "cat-file --batch=%(rest)" use the ref-filter
interface, add %(rest) atom for ref-filter. Introduce the
reject_atom() to reject the atom %(rest) for "git for-each-ref",
"git branch", "git tag" and "git verify-tag".

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c             | 25 +++++++++++++++++++++++++
 ref-filter.h             |  5 ++++-
 t/t3203-branch-output.sh |  4 ++++
 t/t6300-for-each-ref.sh  |  4 ++++
 t/t7004-tag.sh           |  4 ++++
 t/t7030-verify-tag.sh    |  4 ++++
 6 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index c8e561a3687..af8216dcd5b 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -157,6 +157,7 @@ enum atom_type {
 	ATOM_IF,
 	ATOM_THEN,
 	ATOM_ELSE,
+	ATOM_REST,
 };
 
 /*
@@ -559,6 +560,15 @@ static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 	return 0;
 }
 
+static int rest_atom_parser(struct ref_format *format, struct used_atom *atom,
+			    const char *arg, struct strbuf *err)
+{
+	if (arg)
+		return strbuf_addf_ret(err, -1, _("%%(rest) does not take arguments"));
+	format->use_rest = 1;
+	return 0;
+}
+
 static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
@@ -615,6 +625,7 @@ static struct {
 	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
 	[ATOM_THEN] = { "then", SOURCE_NONE },
 	[ATOM_ELSE] = { "else", SOURCE_NONE },
+	[ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -989,6 +1000,11 @@ static const char *find_next(const char *cp)
 	return NULL;
 }
 
+static int reject_atom(enum atom_type atom_type)
+{
+	return atom_type == ATOM_REST;
+}
+
 /*
  * Make sure the format string is well formed, and parse out
  * the used atoms.
@@ -1009,6 +1025,8 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (reject_atom(used_atom[at].atom_type))
+			die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
 
 		if ((format->quote_style == QUOTE_PYTHON ||
 		     format->quote_style == QUOTE_SHELL ||
@@ -1928,6 +1946,12 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
+		} else if (atom_type == ATOM_REST) {
+			if (ref->rest)
+				v->s = xstrdup(ref->rest);
+			else
+				v->s = xstrdup("");
+			continue;
 		} else
 			continue;
 
@@ -2145,6 +2169,7 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
 
 	FLEX_ALLOC_STR(ref, refname, refname);
 	oidcpy(&ref->objectname, oid);
+	ref->rest = NULL;
 
 	return ref;
 }
diff --git a/ref-filter.h b/ref-filter.h
index 74fb423fc89..c15dee8d6b9 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -38,6 +38,7 @@ struct ref_sorting {
 
 struct ref_array_item {
 	struct object_id objectname;
+	const char *rest;
 	int flag;
 	unsigned int kind;
 	const char *symref;
@@ -76,14 +77,16 @@ struct ref_format {
 	 * verify_ref_format() afterwards to finalize.
 	 */
 	const char *format;
+	const char *rest;
 	int quote_style;
+	int use_rest;
 	int use_color;
 
 	/* Internal state to ref-filter */
 	int need_color_reset_at_eol;
 };
 
-#define REF_FORMAT_INIT { NULL, 0, -1 }
+#define REF_FORMAT_INIT { .use_color = -1 }
 
 /*  Macros for checking --merged and --no-merged options */
 #define _OPT_MERGED_NO_MERGED(option, filter, h) \
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index 5325b9f67a0..6e94c6db7b5 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -340,6 +340,10 @@ test_expect_success 'git branch --format option' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git branch with --format=%(rest) must fail' '
+	test_must_fail git branch --format="%(rest)" >actual
+'
+
 test_expect_success 'worktree colors correct' '
 	cat >expect <<-EOF &&
 	* <GREEN>(HEAD detached from fromtag)<RESET>
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 3d15d0a5360..0d2e062f791 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1211,6 +1211,10 @@ test_expect_success 'basic atom: head contents:trailers' '
 	test_cmp expect actual.clean
 '
 
+test_expect_success 'basic atom: rest must fail' '
+	test_must_fail git for-each-ref --format="%(rest)" refs/heads/main
+'
+
 test_expect_success 'trailer parsing not fooled by --- line' '
 	git commit --allow-empty -F - <<-\EOF &&
 	this is the subject
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 2f72c5c6883..082be85dffc 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1998,6 +1998,10 @@ test_expect_success '--format should list tags as per format given' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git tag -l with --format="%(rest)" must fail' '
+	test_must_fail git tag -l --format="%(rest)" "v1*"
+'
+
 test_expect_success "set up color tests" '
 	echo "<RED>v1.0<RESET>" >expect.color &&
 	echo "v1.0" >expect.bare &&
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 3cefde9602b..10faa645157 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -194,6 +194,10 @@ test_expect_success GPG 'verifying tag with --format' '
 	test_cmp expect actual
 '
 
+test_expect_success GPG 'verifying tag with --format="%(rest)" must fail' '
+	test_must_fail git verify-tag --format="%(rest)" "fourth-signed"
+'
+
 test_expect_success GPG 'verifying a forged tag with --format should fail silently' '
 	test_must_fail git verify-tag --format="tagname : %(tag)" $(cat forged1.tag) >actual-forged &&
 	test_must_be_empty actual-forged
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-22  7:39 ` [PATCH 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
@ 2021-07-22  8:06   ` Ævar Arnfjörð Bjarmason
  2021-07-22  8:31     ` ZheNing Hu
  0 siblings, 1 reply; 45+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22  8:06 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Hariom Verma,
	Bagas Sanjaya, Jeff King, Eric Sunshine, Philip Oakley,
	ZheNing Hu


On Thu, Jul 22 2021, ZheNing Hu via GitGitGadget wrote:

> [..] add a new member in
> `struct atom_value`: `s_size`, which can record raw object size, it
> can help us add raw object data to the buffer or compare two buffers
> which contain raw object data.

I didn't look in detail, just one nit/comment, no need to re-roll for this:


>  struct atom_value {
>  	const char *s;
> +	size_t s_size;
>  	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
>  		       struct strbuf *err);
>  	uintmax_t value; /* used for sorting when not FIELD_STR */
>  	struct used_atom *atom;
>  };
>  
> +#define ATOM_VALUE_S_SIZE_INIT (-1)
> +

This and then later doing (this appears to be the only initialization?):

>  	if (format->need_color_reset_at_eol) {
>  		struct atom_value resetv;
> +		resetv.s_size = ATOM_VALUE_S_SIZE_INIT;
>  		resetv.s = GIT_COLOR_RESET;
>  		if (append_atom(&resetv, &state, error_buf)) {
>  			pop_stack_element(&state.stack);

Is a rather unusual pattern, more typical would be:

	struct atom_value { .... }
	#define ATOM_VALUE_INIT { \
		.s_size = -1, \
	}

Then:

	struct atom_value resetv = ATOM_VALUE_INIT;

You could keep that "#define ATOM_VALUE_S_SIZE_INIT (-1)" to check the
init value, personally I think it's just fine to hardcode the -1 literal
and do "size < 0" checks, which we us as convention in a lot of other
places for "not initialized".

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms
  2021-07-22  7:39 [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                   ` (4 preceding siblings ...)
  2021-07-22  7:39 ` [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
@ 2021-07-22  8:08 ` Christian Couder
  2021-07-22  8:22   ` ZheNing Hu
  2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
  6 siblings, 1 reply; 45+ messages in thread
From: Christian Couder @ 2021-07-22  8:08 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu

On Thu, Jul 22, 2021 at 9:39 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This patch series is split from my main patch series zh/ref-filter-raw-data,
> to make things easier for reviewers to read.

Thanks for doing that!

> This patch series provided %(raw) and %(rest) for ref-filter, which will be
> used by zh/cat-file-reuse-ref-filter-logic later.

Great!

It would be nice if you could also point to the previous times when
you sent this to the mailing list, so that previous reviews can more
easily be found and maybe checked.

Another nice improvement would be to link to a branch on GitHub or
GitLab, so that people could more easily fetch your work and try it.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-22  7:39 ` [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
@ 2021-07-22  8:11   ` Ævar Arnfjörð Bjarmason
  2021-07-22  9:10     ` ZheNing Hu
  2021-07-23  9:09   ` Jacob Keller
  1 sibling, 1 reply; 45+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-22  8:11 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Christian Couder, Hariom Verma,
	Bagas Sanjaya, Jeff King, Eric Sunshine, Philip Oakley,
	ZheNing Hu


On Thu, Jul 22 2021, ZheNing Hu via GitGitGadget wrote:


> +		} else if (atom_type == ATOM_REST) {
> +			if (ref->rest)
> +				v->s = xstrdup(ref->rest);
> +			else
> +				v->s = xstrdup("");
> +			continue;
>  		} else
>  			continue;

Another light reading nit, maybe more readable as:

    } else if (atom_type == ATOM_REST && ref->rest) {
        ...
    } else if (atom_type == ATOM_REST) {
        ...
    }
    continue;

But we can't do that "continue" at the end because there's two cases
where we don't continue, I wondered if elimanting that special case
would make it easier, patch below. Probably not worth it & feel free to
ignore:

 ref-filter.c | 63 +++++++++++++++++++++++++-----------------------------------
 1 file changed, 26 insertions(+), 37 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 81e77b13ad2..189244fed6f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1890,18 +1890,26 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			name++;
 		}
 
-		if (atom_type == ATOM_REFNAME)
+		if (atom_type == ATOM_REFNAME) {
 			refname = get_refname(atom, ref);
-		else if (atom_type == ATOM_WORKTREEPATH) {
-			if (ref->kind == FILTER_REFS_BRANCHES)
-				v->s = get_worktree_path(atom, ref);
+			if (!deref)
+				v->s = xstrdup(refname);
 			else
-				v->s = xstrdup("");
-			continue;
-		}
-		else if (atom_type == ATOM_SYMREF)
+				v->s = xstrfmt("%s^{}", refname);
+			free((char *)refname);
+		} else if (atom_type == ATOM_WORKTREEPATH &&
+			   ref->kind == FILTER_REFS_BRANCHES) {
+			v->s = get_worktree_path(atom, ref);
+		} else if (atom_type == ATOM_WORKTREEPATH) {
+			v->s = xstrdup("");
+		} else if (atom_type == ATOM_SYMREF) {
 			refname = get_symref(atom, ref);
-		else if (atom_type == ATOM_UPSTREAM) {
+			if (!deref)
+				v->s = xstrdup(refname);
+			else
+				v->s = xstrfmt("%s^{}", refname);
+			free((char *)refname);
+		} else if (atom_type == ATOM_UPSTREAM) {
 			const char *branch_name;
 			/* only local branches may have an upstream */
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1916,7 +1924,6 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				fill_remote_ref_details(atom, refname, branch, &v->s);
 			else
 				v->s = xstrdup("");
-			continue;
 		} else if (atom_type == ATOM_PUSH && atom->u.remote_ref.push) {
 			const char *branch_name;
 			v->s = xstrdup("");
@@ -1935,10 +1942,8 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			/* We will definitely re-init v->s on the next line. */
 			free((char *)v->s);
 			fill_remote_ref_details(atom, refname, branch, &v->s);
-			continue;
 		} else if (atom_type == ATOM_COLOR) {
 			v->s = xstrdup(atom->u.color);
-			continue;
 		} else if (atom_type == ATOM_FLAG) {
 			char buf[256], *cp = buf;
 			if (ref->flag & REF_ISSYMREF)
@@ -1951,24 +1956,20 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				*cp = '\0';
 				v->s = xstrdup(buf + 1);
 			}
-			continue;
 		} else if (!deref && atom_type == ATOM_OBJECTNAME) {
 			   v->s = xstrdup(do_grab_oid("objectname", &ref->objectname, atom));
-			   continue;
+		} else if (atom_type == ATOM_HEAD &&
+			   atom->u.head &&
+			   !strcmp(ref->refname, atom->u.head)) {
+			v->s = xstrdup("*");
 		} else if (atom_type == ATOM_HEAD) {
-			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
-				v->s = xstrdup("*");
-			else
-				v->s = xstrdup(" ");
-			continue;
+			v->s = xstrdup(" ");
 		} else if (atom_type == ATOM_ALIGN) {
 			v->handler = align_atom_handler;
 			v->s = xstrdup("");
-			continue;
 		} else if (atom_type == ATOM_END) {
 			v->handler = end_atom_handler;
 			v->s = xstrdup("");
-			continue;
 		} else if (atom_type == ATOM_IF) {
 			const char *s;
 			if (skip_prefix(name, "if:", &s))
@@ -1976,29 +1977,17 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			else
 				v->s = xstrdup("");
 			v->handler = if_atom_handler;
-			continue;
 		} else if (atom_type == ATOM_THEN) {
 			v->handler = then_atom_handler;
 			v->s = xstrdup("");
-			continue;
 		} else if (atom_type == ATOM_ELSE) {
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
-			continue;
+		} else if (atom_type == ATOM_REST && ref->rest) {
+			v->s = xstrdup(ref->rest);
 		} else if (atom_type == ATOM_REST) {
-			if (ref->rest)
-				v->s = xstrdup(ref->rest);
-			else
-				v->s = xstrdup("");
-			continue;
-		} else
-			continue;
-
-		if (!deref)
-			v->s = xstrdup(refname);
-		else
-			v->s = xstrfmt("%s^{}", refname);
-		free((char *)refname);
+			v->s = xstrdup("");
+		}
 	}
 
 	for (i = 0; i < used_atom_cnt; i++) {

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms
  2021-07-22  8:08 ` [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms Christian Couder
@ 2021-07-22  8:22   ` ZheNing Hu
  0 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu @ 2021-07-22  8:22 UTC (permalink / raw)
  To: Christian Couder
  Cc: ZheNing Hu via GitGitGadget, git, Junio C Hamano, Hariom Verma,
	Bagas Sanjaya, Jeff King, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Philip Oakley

Christian Couder <christian.couder@gmail.com> 于2021年7月22日周四 下午4:08写道:
>
> On Thu, Jul 22, 2021 at 9:39 AM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > This patch series is split from my main patch series zh/ref-filter-raw-data,
> > to make things easier for reviewers to read.
>
> Thanks for doing that!
>
> > This patch series provided %(raw) and %(rest) for ref-filter, which will be
> > used by zh/cat-file-reuse-ref-filter-logic later.
>
> Great!
>
> It would be nice if you could also point to the previous times when
> you sent this to the mailing list, so that previous reviews can more
> easily be found and maybe checked.
>

Okey, I will do it next time.

> Another nice improvement would be to link to a branch on GitHub or
> GitLab, so that people could more easily fetch your work and try it.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-22  8:06   ` Ævar Arnfjörð Bjarmason
@ 2021-07-22  8:31     ` ZheNing Hu
  0 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu @ 2021-07-22  8:31 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma, Bagas Sanjaya, Jeff King,
	Eric Sunshine, Philip Oakley

Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2021年7月22日周四 下午4:10写道:

>
>
> On Thu, Jul 22 2021, ZheNing Hu via GitGitGadget wrote:
>
> > [..] add a new member in
> > `struct atom_value`: `s_size`, which can record raw object size, it
> > can help us add raw object data to the buffer or compare two buffers
> > which contain raw object data.
>
> I didn't look in detail, just one nit/comment, no need to re-roll for this:
>
>
> >  struct atom_value {
> >       const char *s;
> > +     size_t s_size;
> >       int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
> >                      struct strbuf *err);
> >       uintmax_t value; /* used for sorting when not FIELD_STR */
> >       struct used_atom *atom;
> >  };
> >
> > +#define ATOM_VALUE_S_SIZE_INIT (-1)
> > +
>
> This and then later doing (this appears to be the only initialization?):

Yes, "s_size = -1” is the only initialization.

>
> >       if (format->need_color_reset_at_eol) {
> >               struct atom_value resetv;
> > +             resetv.s_size = ATOM_VALUE_S_SIZE_INIT;
> >               resetv.s = GIT_COLOR_RESET;
> >               if (append_atom(&resetv, &state, error_buf)) {
> >                       pop_stack_element(&state.stack);
>
> Is a rather unusual pattern, more typical would be:
>
>         struct atom_value { .... }
>         #define ATOM_VALUE_INIT { \
>                 .s_size = -1, \
>         }
>
> Then:
>
>         struct atom_value resetv = ATOM_VALUE_INIT;
>
> You could keep that "#define ATOM_VALUE_S_SIZE_INIT (-1)" to check the
> init value, personally I think it's just fine to hardcode the -1 literal
> and do "size < 0" checks, which we us as convention in a lot of other
> places for "not initialized".

Maybe you are right, hard-coding "-1" is ok.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-22  7:39 ` [PATCH 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
@ 2021-07-22  8:51   ` Christian Couder
  2021-07-22  9:15     ` ZheNing Hu
  0 siblings, 1 reply; 45+ messages in thread
From: Christian Couder @ 2021-07-22  8:51 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu

On Thu, Jul 22, 2021 at 9:39 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> Only tag and commit objects use `grab_sub_body_contents()` to grab
> object contents in the current codebase.  We want to teach the
> function to also handle blobs and trees to get their raw data,
> without parsing a blob (whose contents looks like a commit or a tag)
> incorrectly as a commit or a tag.
>
> Skip the block of code that is specific to handling commits and tags
> early when the given object is of a wrong type to help later
> addition to handle other types of objects in this function.

Small nit in case the series is rerolled: it looks like this commit
would change only `grab_sub_body_contents()`, but there are a few
changes to grab_values() too. Maybe you could add that it's needed to
pass a `struct expand_data *data` instead of only `void *buf` to both
`grab_sub_body_contents()` and `grab_values()` to be able to check the
object type.


> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Hariom Verma <hariom18599@gmail.com>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  ref-filter.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 4db0e40ff4c..5cee6512fba 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
>  }
>
>  /* See grab_values */
> -static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
> +static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
>  {
>         int i;
>         const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
>         size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
> +       void *buf = data->content;
>
>         for (i = 0; i < used_atom_cnt; i++) {
>                 struct used_atom *atom = &used_atom[i];
> @@ -1371,10 +1372,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
>                         continue;
>                 if (deref)
>                         name++;
> -               if (strcmp(name, "body") &&
> -                   !starts_with(name, "subject") &&
> -                   !starts_with(name, "trailers") &&
> -                   !starts_with(name, "contents"))
> +
> +               if ((data->type != OBJ_TAG &&
> +                    data->type != OBJ_COMMIT) ||
> +                   (strcmp(name, "body") &&
> +                    !starts_with(name, "subject") &&
> +                    !starts_with(name, "trailers") &&
> +                    !starts_with(name, "contents")))
>                         continue;
>                 if (!subpos)
>                         find_subpos(buf,
> @@ -1438,17 +1442,19 @@ static void fill_missing_values(struct atom_value *val)
>   * pointed at by the ref itself; otherwise it is the object the
>   * ref (which is a tag) refers to.
>   */
> -static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
> +static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
>  {
> +       void *buf = data->content;
> +
>         switch (obj->type) {
>         case OBJ_TAG:
>                 grab_tag_values(val, deref, obj);
> -               grab_sub_body_contents(val, deref, buf);
> +               grab_sub_body_contents(val, deref, data);
>                 grab_person("tagger", val, deref, buf);
>                 break;
>         case OBJ_COMMIT:
>                 grab_commit_values(val, deref, obj);
> -               grab_sub_body_contents(val, deref, buf);
> +               grab_sub_body_contents(val, deref, data);
>                 grab_person("author", val, deref, buf);
>                 grab_person("committer", val, deref, buf);
>                 break;
> @@ -1678,7 +1684,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
>                         return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
>                                                oid_to_hex(&oi->oid), ref->refname);
>                 }
> -               grab_values(ref->value, deref, *obj, oi->content);
> +               grab_values(ref->value, deref, *obj, oi);
>         }
>
>         grab_common_values(ref->value, deref, oi);
> --
> gitgitgadget
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-22  8:11   ` Ævar Arnfjörð Bjarmason
@ 2021-07-22  9:10     ` ZheNing Hu
  0 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu @ 2021-07-22  9:10 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: ZheNing Hu via GitGitGadget, Git List, Junio C Hamano,
	Christian Couder, Hariom Verma, Bagas Sanjaya, Jeff King,
	Eric Sunshine, Philip Oakley

Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2021年7月22日周四 下午4:24写道:
>
>
> On Thu, Jul 22 2021, ZheNing Hu via GitGitGadget wrote:
>
>
> > +             } else if (atom_type == ATOM_REST) {
> > +                     if (ref->rest)
> > +                             v->s = xstrdup(ref->rest);
> > +                     else
> > +                             v->s = xstrdup("");
> > +                     continue;
> >               } else
> >                       continue;
>
> Another light reading nit, maybe more readable as:
>
>     } else if (atom_type == ATOM_REST && ref->rest) {
>         ...
>     } else if (atom_type == ATOM_REST) {
>         ...
>     }
>     continue;
>
> But we can't do that "continue" at the end because there's two cases
> where we don't continue, I wondered if elimanting that special case
> would make it easier, patch below. Probably not worth it & feel free to
> ignore:

Yeah, the readability here is really bad, so many "continue" just to not execute
part of the operation on refname. In fact, I have a similar patch
(which will not
be sent to the mailing list for the time being), it use switch and
case to replace if
and else. [1]

>
>  ref-filter.c | 63 +++++++++++++++++++++++++-----------------------------------
>  1 file changed, 26 insertions(+), 37 deletions(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 81e77b13ad2..189244fed6f 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1890,18 +1890,26 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
>                         name++;
>                 }
>
> -               if (atom_type == ATOM_REFNAME)
> +               if (atom_type == ATOM_REFNAME) {
>                         refname = get_refname(atom, ref);
> -               else if (atom_type == ATOM_WORKTREEPATH) {
> -                       if (ref->kind == FILTER_REFS_BRANCHES)
> -                               v->s = get_worktree_path(atom, ref);
> +                       if (!deref)
> +                               v->s = xstrdup(refname);
>                         else
> -                               v->s = xstrdup("");
> -                       continue;
> -               }
> -               else if (atom_type == ATOM_SYMREF)
> +                               v->s = xstrfmt("%s^{}", refname);
> +                       free((char *)refname);
> +               } else if (atom_type == ATOM_WORKTREEPATH &&
> +                          ref->kind == FILTER_REFS_BRANCHES) {
> +                       v->s = get_worktree_path(atom, ref);
> +               } else if (atom_type == ATOM_WORKTREEPATH) {
> +                       v->s = xstrdup("");
> +               } else if (atom_type == ATOM_SYMREF) {
>                         refname = get_symref(atom, ref);
> -               else if (atom_type == ATOM_UPSTREAM) {
> +                       if (!deref)
> +                               v->s = xstrdup(refname);
> +                       else
> +                               v->s = xstrfmt("%s^{}", refname);
> +                       free((char *)refname);
> +               } else if (atom_type == ATOM_UPSTREAM) {
>                         const char *branch_name;
>                         /* only local branches may have an upstream */
>                         if (!skip_prefix(ref->refname, "refs/heads/",
> @@ -1916,7 +1924,6 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
>                                 fill_remote_ref_details(atom, refname, branch, &v->s);
>                         else
>                                 v->s = xstrdup("");
> -                       continue;
>                 } else if (atom_type == ATOM_PUSH && atom->u.remote_ref.push) {
>                         const char *branch_name;
>                         v->s = xstrdup("");
> @@ -1935,10 +1942,8 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
>                         /* We will definitely re-init v->s on the next line. */
>                         free((char *)v->s);
>                         fill_remote_ref_details(atom, refname, branch, &v->s);
> -                       continue;
>                 } else if (atom_type == ATOM_COLOR) {
>                         v->s = xstrdup(atom->u.color);
> -                       continue;
>                 } else if (atom_type == ATOM_FLAG) {
>                         char buf[256], *cp = buf;
>                         if (ref->flag & REF_ISSYMREF)
> @@ -1951,24 +1956,20 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
>                                 *cp = '\0';
>                                 v->s = xstrdup(buf + 1);
>                         }
> -                       continue;
>                 } else if (!deref && atom_type == ATOM_OBJECTNAME) {
>                            v->s = xstrdup(do_grab_oid("objectname", &ref->objectname, atom));
> -                          continue;
> +               } else if (atom_type == ATOM_HEAD &&
> +                          atom->u.head &&
> +                          !strcmp(ref->refname, atom->u.head)) {
> +                       v->s = xstrdup("*");
>                 } else if (atom_type == ATOM_HEAD) {
> -                       if (atom->u.head && !strcmp(ref->refname, atom->u.head))
> -                               v->s = xstrdup("*");
> -                       else
> -                               v->s = xstrdup(" ");
> -                       continue;
> +                       v->s = xstrdup(" ");
>                 } else if (atom_type == ATOM_ALIGN) {
>                         v->handler = align_atom_handler;
>                         v->s = xstrdup("");
> -                       continue;
>                 } else if (atom_type == ATOM_END) {
>                         v->handler = end_atom_handler;
>                         v->s = xstrdup("");
> -                       continue;
>                 } else if (atom_type == ATOM_IF) {
>                         const char *s;
>                         if (skip_prefix(name, "if:", &s))
> @@ -1976,29 +1977,17 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
>                         else
>                                 v->s = xstrdup("");
>                         v->handler = if_atom_handler;
> -                       continue;
>                 } else if (atom_type == ATOM_THEN) {
>                         v->handler = then_atom_handler;
>                         v->s = xstrdup("");
> -                       continue;
>                 } else if (atom_type == ATOM_ELSE) {
>                         v->handler = else_atom_handler;
>                         v->s = xstrdup("");
> -                       continue;
> +               } else if (atom_type == ATOM_REST && ref->rest) {
> +                       v->s = xstrdup(ref->rest);
>                 } else if (atom_type == ATOM_REST) {
> -                       if (ref->rest)
> -                               v->s = xstrdup(ref->rest);
> -                       else
> -                               v->s = xstrdup("");
> -                       continue;
> -               } else
> -                       continue;
> -
> -               if (!deref)
> -                       v->s = xstrdup(refname);
> -               else
> -                       v->s = xstrfmt("%s^{}", refname);
> -               free((char *)refname);
> +                       v->s = xstrdup("");
> +               }
>         }
>
>         for (i = 0; i < used_atom_cnt; i++) {

[1]: https://github.com/adlternative/git/commit/bbc6ecba4c0f65e7328e571b0d38ff99f800dc09

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-22  8:51   ` Christian Couder
@ 2021-07-22  9:15     ` ZheNing Hu
  0 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu @ 2021-07-22  9:15 UTC (permalink / raw)
  To: Christian Couder
  Cc: ZheNing Hu via GitGitGadget, git, Junio C Hamano, Hariom Verma,
	Bagas Sanjaya, Jeff King, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Philip Oakley

Christian Couder <christian.couder@gmail.com> 于2021年7月22日周四 下午4:51写道:
>
> On Thu, Jul 22, 2021 at 9:39 AM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Only tag and commit objects use `grab_sub_body_contents()` to grab
> > object contents in the current codebase.  We want to teach the
> > function to also handle blobs and trees to get their raw data,
> > without parsing a blob (whose contents looks like a commit or a tag)
> > incorrectly as a commit or a tag.
> >
> > Skip the block of code that is specific to handling commits and tags
> > early when the given object is of a wrong type to help later
> > addition to handle other types of objects in this function.
>
> Small nit in case the series is rerolled: it looks like this commit
> would change only `grab_sub_body_contents()`, but there are a few
> changes to grab_values() too. Maybe you could add that it's needed to
> pass a `struct expand_data *data` instead of only `void *buf` to both
> `grab_sub_body_contents()` and `grab_values()` to be able to check the
> object type.
>
Indeed so. I will add them.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms
  2021-07-22  7:39 [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                   ` (5 preceding siblings ...)
  2021-07-22  8:08 ` [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms Christian Couder
@ 2021-07-23  9:03 ` ZheNing Hu via GitGitGadget
  2021-07-23  9:03   ` [PATCH v2 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
                     ` (5 more replies)
  6 siblings, 6 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-23  9:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu

This patch series is split from my main patch series zh/ref-filter-raw-data,
to make things easier for reviewers to read. It's last version is here:
https://lore.kernel.org/git/pull.993.v2.git.1626363626.gitgitgadget@gmail.com/

This patch series provided %(raw) and %(rest) for ref-filter, which will be
used by zh/cat-file-reuse-ref-filter-logic later.

Change from last version:

 1. Define ATOM_VALUE_INIT used to initialize struct atom_value and delete
    ATOM_VALUE_S_SIZE_INIT macro.
 2. Provide more complete commit message.

ZheNing Hu (5):
  [GSOC] ref-filter: add obj-type check in grab contents
  [GSOC] ref-filter: add %(raw) atom
  [GSOC] ref-filter: --format=%(raw) re-support --perl
  [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  [GSOC] ref-filter: add %(rest) atom

 Documentation/git-for-each-ref.txt |   9 ++
 builtin/tag.c                      |   2 +-
 quote.c                            |  17 ++
 quote.h                            |   1 +
 ref-filter.c                       | 242 ++++++++++++++++++++++-------
 ref-filter.h                       |   9 +-
 t/t3203-branch-output.sh           |   4 +
 t/t6300-for-each-ref.sh            | 235 ++++++++++++++++++++++++++++
 t/t7004-tag.sh                     |   4 +
 t/t7030-verify-tag.sh              |   4 +
 10 files changed, 464 insertions(+), 63 deletions(-)


base-commit: daab8a564f8bbac55f70f8bf86c070e001a9b006
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1000%2Fadlternative%2Fref-filter-raw-data-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1000/adlternative/ref-filter-raw-data-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1000

Range-diff vs v1:

 1:  a7b4e45f14a ! 1:  d77268bb188 [GSOC] ref-filter: add obj-type check in grab contents
     @@ Commit message
          object contents in the current codebase.  We want to teach the
          function to also handle blobs and trees to get their raw data,
          without parsing a blob (whose contents looks like a commit or a tag)
     -    incorrectly as a commit or a tag.
     +    incorrectly as a commit or a tag. So it's needed to pass a
     +    `struct expand_data *data` instread of only `void *buf` to both
     +    `grab_sub_body_contents()` and `grab_values()` to be able to check
     +    the object type.
      
          Skip the block of code that is specific to handling commits and tags
          early when the given object is of a wrong type to help later
 2:  ecd41b370e6 ! 2:  eafb79bad62 [GSOC] ref-filter: add %(raw) atom
     @@ ref-filter.c: struct ref_formatting_state {
       	struct used_atom *atom;
       };
       
     -+#define ATOM_VALUE_S_SIZE_INIT (-1)
     ++#define ATOM_VALUE_INIT { \
     ++	.s_size = -1 \
     ++}
      +
       /*
        * Used to parse format string and sort specifiers
     @@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
       	switch (quote_style) {
       	case QUOTE_NONE:
      -		strbuf_addstr(s, str);
     -+		if (len != ATOM_VALUE_S_SIZE_INIT)
     ++		if (len != -1)
      +			strbuf_add(s, str, len);
      +		else
      +			strbuf_addstr(s, str);
     @@ ref-filter.c: static int append_atom(struct atom_value *v, struct ref_formatting
      +		quote_formatting(&state->stack->output, v->s, v->s_size, state->quote_style);
       	else
      -		strbuf_addstr(&state->stack->output, v->s);
     -+		if (v->s_size != ATOM_VALUE_S_SIZE_INIT)
     ++		if (v->s_size != -1)
      +			strbuf_add(&state->stack->output, v->s, v->s_size);
      +		else
      +			strbuf_addstr(&state->stack->output, v->s);
     @@ ref-filter.c: static int populate_value(struct ref_array_item *ref, struct strbu
       		const char *refname;
       		struct branch *branch = NULL;
       
     -+		v->s_size = ATOM_VALUE_S_SIZE_INIT;
     ++		v->s_size = -1;
       		v->handler = append_atom;
       		v->atom = atom;
       
     @@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array
      -		cmp_fn = s->sort_flags & REF_SORTING_ICASE
      -			? strcasecmp : strcmp;
      -		cmp = cmp_fn(va->s, vb->s);
     -+		if (va->s_size == ATOM_VALUE_S_SIZE_INIT &&
     -+		    vb->s_size == ATOM_VALUE_S_SIZE_INIT) {
     ++		if (va->s_size == -1 && vb->s_size == -1) {
      +			int (*cmp_fn)(const char *, const char *);
      +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
      +				? strcasecmp : strcmp;
      +			cmp = cmp_fn(va->s, vb->s);
      +		} else {
     -+			size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
     ++			size_t a_size = va->s_size == -1 ?
      +					strlen(va->s) : va->s_size;
     -+			size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
     ++			size_t b_size = vb->s_size == -1 ?
      +					strlen(vb->s) : vb->s_size;
      +			int (*cmp_fn)(const void *, const void *, size_t);
      +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
     @@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array
       		if (va->value < vb->value)
       			cmp = -1;
      @@ ref-filter.c: int format_ref_array_item(struct ref_array_item *info,
     + 		append_literal(cp, sp, &state);
       	}
       	if (format->need_color_reset_at_eol) {
     - 		struct atom_value resetv;
     -+		resetv.s_size = ATOM_VALUE_S_SIZE_INIT;
     +-		struct atom_value resetv;
     ++		struct atom_value resetv = ATOM_VALUE_INIT;
       		resetv.s = GIT_COLOR_RESET;
       		if (append_atom(&resetv, &state, error_buf)) {
       			pop_stack_element(&state.stack);
 3:  b6438fd4308 ! 3:  00a29dd16d7 [GSOC] ref-filter: --format=%(raw) re-support --perl
     @@ ref-filter.c: static void quote_formatting(struct strbuf *s, const char *str, si
       		break;
       	case QUOTE_PERL:
      -		perl_quote_buf(s, str);
     -+		if (len != ATOM_VALUE_S_SIZE_INIT)
     ++		if (len != -1)
      +			perl_quote_buf_with_len(s, str, len);
      +		else
      +			perl_quote_buf(s, str);
 4:  33d332c664e = 4:  3a8173b42ed [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
 5:  75eb2f6740e = 5:  71249dfd4e0 [GSOC] ref-filter: add %(rest) atom

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
@ 2021-07-23  9:03   ` ZheNing Hu via GitGitGadget
  2021-07-23  9:04   ` [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-23  9:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase.  We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag. So it's needed to pass a
`struct expand_data *data` instread of only `void *buf` to both
`grab_sub_body_contents()` and `grab_values()` to be able to check
the object type.

Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4db0e40ff4c..5cee6512fba 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 }
 
 /* See grab_values */
-static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
+static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
 {
 	int i;
 	const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
 	size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
+	void *buf = data->content;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
@@ -1371,10 +1372,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			continue;
 		if (deref)
 			name++;
-		if (strcmp(name, "body") &&
-		    !starts_with(name, "subject") &&
-		    !starts_with(name, "trailers") &&
-		    !starts_with(name, "contents"))
+
+		if ((data->type != OBJ_TAG &&
+		     data->type != OBJ_COMMIT) ||
+		    (strcmp(name, "body") &&
+		     !starts_with(name, "subject") &&
+		     !starts_with(name, "trailers") &&
+		     !starts_with(name, "contents")))
 			continue;
 		if (!subpos)
 			find_subpos(buf,
@@ -1438,17 +1442,19 @@ static void fill_missing_values(struct atom_value *val)
  * pointed at by the ref itself; otherwise it is the object the
  * ref (which is a tag) refers to.
  */
-static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
+static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
 {
+	void *buf = data->content;
+
 	switch (obj->type) {
 	case OBJ_TAG:
 		grab_tag_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("tagger", val, deref, buf);
 		break;
 	case OBJ_COMMIT:
 		grab_commit_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
 		break;
@@ -1678,7 +1684,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
 			return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
 					       oid_to_hex(&oi->oid), ref->refname);
 		}
-		grab_values(ref->value, deref, *obj, oi->content);
+		grab_values(ref->value, deref, *obj, oi);
 	}
 
 	grab_common_values(ref->value, deref, oi);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
  2021-07-23  9:03   ` [PATCH v2 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
@ 2021-07-23  9:04   ` ZheNing Hu via GitGitGadget
  2021-07-23 16:38     ` Junio C Hamano
  2021-07-23  9:04   ` [PATCH v2 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-23  9:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.

The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depends on the output of the atom being
text (specifically, no embedded NULs in it).

E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.

Therefore, we need to find a way to record the length of the
atom_value's member `s`. Although strbuf can already record the
string and its length, if we want to replace the type of atom_value's
member `s` with strbuf, many places in ref-filter that are filled
with dynamically allocated mermory in `v->s` are not easy to replace.
At the same time, we need to check if `v->s == NULL` in
populate_value(), and strbuf cannot easily distinguish NULL and empty
strings, but c-style "const char *" can do it. So add a new member in
`struct atom_value`: `s_size`, which can record raw object size, it
can help us add raw object data to the buffer or compare two buffers
which contain raw object data.

Note that `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, and `--perl` because if the binary raw data is passed to a
variable in such languages, these may not support arbitrary binary data
in their string variable type.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |   9 ++
 ref-filter.c                       | 142 +++++++++++++++----
 t/t6300-for-each-ref.sh            | 216 +++++++++++++++++++++++++++++
 3 files changed, 339 insertions(+), 28 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2ae2478de70..cbb6f87d13f 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -235,6 +235,15 @@ and `date` to extract the named component.  For email fields (`authoremail`,
 without angle brackets, and `:localpart` to get the part before the `@` symbol
 out of the trimmed email.
 
+The raw data in an object is `raw`.
+
+raw:size::
+	The raw data size of the object.
+
+Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
+`--perl` because such language may not support arbitrary binary data in their
+string variable type.
+
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
 
diff --git a/ref-filter.c b/ref-filter.c
index 5cee6512fba..a713507d783 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -144,6 +144,7 @@ enum atom_type {
 	ATOM_BODY,
 	ATOM_TRAILERS,
 	ATOM_CONTENTS,
+	ATOM_RAW,
 	ATOM_UPSTREAM,
 	ATOM_PUSH,
 	ATOM_SYMREF,
@@ -189,6 +190,9 @@ static struct used_atom {
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
+		struct {
+			enum { RAW_BARE, RAW_LENGTH } option;
+		} raw_data;
 		struct {
 			cmp_status cmp_status;
 			const char *str;
@@ -426,6 +430,18 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
+static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.raw_data.option = RAW_BARE;
+	else if (!strcmp(arg, "size"))
+		atom->u.raw_data.option = RAW_LENGTH;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
+	return 0;
+}
+
 static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
@@ -586,6 +602,7 @@ static struct {
 	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
 	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
 	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_RAW] = { "raw", SOURCE_OBJ, FIELD_STR, raw_atom_parser },
 	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -620,12 +637,17 @@ struct ref_formatting_state {
 
 struct atom_value {
 	const char *s;
+	size_t s_size;
 	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
 		       struct strbuf *err);
 	uintmax_t value; /* used for sorting when not FIELD_STR */
 	struct used_atom *atom;
 };
 
+#define ATOM_VALUE_INIT { \
+	.s_size = -1 \
+}
+
 /*
  * Used to parse format string and sort specifiers
  */
@@ -644,13 +666,6 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return strbuf_addf_ret(err, -1, _("malformed field name: %.*s"),
 				       (int)(ep-atom), atom);
 
-	/* Do we have the atom already used elsewhere? */
-	for (i = 0; i < used_atom_cnt; i++) {
-		int len = strlen(used_atom[i].name);
-		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
-			return i;
-	}
-
 	/*
 	 * If the atom name has a colon, strip it and everything after
 	 * it off - it specifies the format for this entry, and
@@ -660,6 +675,13 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	arg = memchr(sp, ':', ep - sp);
 	atom_len = (arg ? arg : ep) - sp;
 
+	/* Do we have the atom already used elsewhere? */
+	for (i = 0; i < used_atom_cnt; i++) {
+		int len = strlen(used_atom[i].name);
+		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
+			return i;
+	}
+
 	/* Is the atom a valid one? */
 	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
 		int len = strlen(valid_atom[i].name);
@@ -709,11 +731,14 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	return at;
 }
 
-static void quote_formatting(struct strbuf *s, const char *str, int quote_style)
+static void quote_formatting(struct strbuf *s, const char *str, size_t len, int quote_style)
 {
 	switch (quote_style) {
 	case QUOTE_NONE:
-		strbuf_addstr(s, str);
+		if (len != -1)
+			strbuf_add(s, str, len);
+		else
+			strbuf_addstr(s, str);
 		break;
 	case QUOTE_SHELL:
 		sq_quote_buf(s, str);
@@ -740,9 +765,12 @@ static int append_atom(struct atom_value *v, struct ref_formatting_state *state,
 	 * encountered.
 	 */
 	if (!state->stack->prev)
-		quote_formatting(&state->stack->output, v->s, state->quote_style);
+		quote_formatting(&state->stack->output, v->s, v->s_size, state->quote_style);
 	else
-		strbuf_addstr(&state->stack->output, v->s);
+		if (v->s_size != -1)
+			strbuf_add(&state->stack->output, v->s, v->s_size);
+		else
+			strbuf_addstr(&state->stack->output, v->s);
 	return 0;
 }
 
@@ -842,21 +870,23 @@ static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state
 	return 0;
 }
 
-static int is_empty(const char *s)
+static int is_empty(struct strbuf *buf)
 {
-	while (*s != '\0') {
-		if (!isspace(*s))
-			return 0;
-		s++;
-	}
-	return 1;
-}
+	const char *cur = buf->buf;
+	const char *end = buf->buf + buf->len;
+
+	while (cur != end && (isspace(*cur)))
+		cur++;
+
+	return cur == end;
+ }
 
 static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_state *state,
 			     struct strbuf *err)
 {
 	struct ref_formatting_stack *cur = state->stack;
 	struct if_then_else *if_then_else = NULL;
+	size_t str_len = 0;
 
 	if (cur->at_end == if_then_else_handler)
 		if_then_else = (struct if_then_else *)cur->at_end_data;
@@ -867,18 +897,22 @@ static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_sta
 	if (if_then_else->else_atom_seen)
 		return strbuf_addf_ret(err, -1, _("format: %%(then) atom used after %%(else)"));
 	if_then_else->then_atom_seen = 1;
+	if (if_then_else->str)
+		str_len = strlen(if_then_else->str);
 	/*
 	 * If the 'equals' or 'notequals' attribute is used then
 	 * perform the required comparison. If not, only non-empty
 	 * strings satisfy the 'if' condition.
 	 */
 	if (if_then_else->cmp_status == COMPARE_EQUAL) {
-		if (!strcmp(if_then_else->str, cur->output.buf))
+		if (str_len == cur->output.len &&
+		    !memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
 	} else if (if_then_else->cmp_status == COMPARE_UNEQUAL) {
-		if (strcmp(if_then_else->str, cur->output.buf))
+		if (str_len != cur->output.len ||
+		    memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
-	} else if (cur->output.len && !is_empty(cur->output.buf))
+	} else if (cur->output.len && !is_empty(&cur->output))
 		if_then_else->condition_satisfied = 1;
 	strbuf_reset(&cur->output);
 	return 0;
@@ -924,7 +958,7 @@ static int end_atom_handler(struct atom_value *atomv, struct ref_formatting_stat
 	 * only on the topmost supporting atom.
 	 */
 	if (!current->prev->prev) {
-		quote_formatting(&s, current->output.buf, state->quote_style);
+		quote_formatting(&s, current->output.buf, current->output.len, state->quote_style);
 		strbuf_swap(&current->output, &s);
 	}
 	strbuf_release(&s);
@@ -974,6 +1008,10 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
+		    used_atom[at].u.raw_data.option == RAW_BARE)
+			die(_("--format=%.*s cannot be used with"
+			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
@@ -1367,12 +1405,25 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
 		struct used_atom *atom = &used_atom[i];
 		const char *name = atom->name;
 		struct atom_value *v = &val[i];
+		enum atom_type atom_type = atom->atom_type;
 
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
+		if (atom_type == ATOM_RAW) {
+			unsigned long buf_size = data->size;
+
+			if (atom->u.raw_data.option == RAW_BARE) {
+				v->s = xmemdupz(buf, buf_size);
+				v->s_size = buf_size;
+			} else if (atom->u.raw_data.option == RAW_LENGTH) {
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
+			}
+			continue;
+		}
+
 		if ((data->type != OBJ_TAG &&
 		     data->type != OBJ_COMMIT) ||
 		    (strcmp(name, "body") &&
@@ -1460,9 +1511,11 @@ static void grab_values(struct atom_value *val, int deref, struct object *obj, s
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	case OBJ_BLOB:
 		/* grab_blob_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	default:
 		die("Eh?  Object of type %d?", obj->type);
@@ -1766,6 +1819,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 		const char *refname;
 		struct branch *branch = NULL;
 
+		v->s_size = -1;
 		v->handler = append_atom;
 		v->atom = atom;
 
@@ -2369,6 +2423,19 @@ static int compare_detached_head(struct ref_array_item *a, struct ref_array_item
 	return 0;
 }
 
+static int memcasecmp(const void *vs1, const void *vs2, size_t n)
+{
+	const char *s1 = vs1, *s2 = vs2;
+	const char *end = s1 + n;
+
+	for (; s1 < end; s1++, s2++) {
+		int diff = tolower(*s1) - tolower(*s2);
+		if (diff)
+			return diff;
+	}
+	return 0;
+}
+
 static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
 {
 	struct atom_value *va, *vb;
@@ -2389,10 +2456,29 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
 	} else if (s->sort_flags & REF_SORTING_VERSION) {
 		cmp = versioncmp(va->s, vb->s);
 	} else if (cmp_type == FIELD_STR) {
-		int (*cmp_fn)(const char *, const char *);
-		cmp_fn = s->sort_flags & REF_SORTING_ICASE
-			? strcasecmp : strcmp;
-		cmp = cmp_fn(va->s, vb->s);
+		if (va->s_size == -1 && vb->s_size == -1) {
+			int (*cmp_fn)(const char *, const char *);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? strcasecmp : strcmp;
+			cmp = cmp_fn(va->s, vb->s);
+		} else {
+			size_t a_size = va->s_size == -1 ?
+					strlen(va->s) : va->s_size;
+			size_t b_size = vb->s_size == -1 ?
+					strlen(vb->s) : vb->s_size;
+			int (*cmp_fn)(const void *, const void *, size_t);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? memcasecmp : memcmp;
+
+			cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
+				     a_size : b_size);
+			if (!cmp) {
+				if (a_size > b_size)
+					cmp = 1;
+				else if (a_size < b_size)
+					cmp = -1;
+			}
+		}
 	} else {
 		if (va->value < vb->value)
 			cmp = -1;
@@ -2491,7 +2577,7 @@ int format_ref_array_item(struct ref_array_item *info,
 		append_literal(cp, sp, &state);
 	}
 	if (format->need_color_reset_at_eol) {
-		struct atom_value resetv;
+		struct atom_value resetv = ATOM_VALUE_INIT;
 		resetv.s = GIT_COLOR_RESET;
 		if (append_atom(&resetv, &state, error_buf)) {
 			pop_stack_element(&state.stack);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 9e0214076b4..18554f62d94 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -130,6 +130,8 @@ test_atom head parent:short=10 ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
+test_atom head raw "$(git cat-file commit refs/heads/main)
+"
 test_atom head '*objectname' ''
 test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
@@ -221,6 +223,15 @@ test_atom tag contents 'Tagging at 1151968727
 '
 test_atom tag HEAD ' '
 
+test_expect_success 'basic atom: refs/tags/testtag *raw' '
+	git cat-file commit refs/tags/testtag^{} >expected &&
+	git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'Check invalid atoms names are errors' '
 	test_must_fail git for-each-ref --format="%(INVALID)" refs/heads
 '
@@ -686,6 +697,15 @@ test_atom refs/tags/signed-empty contents:body ''
 test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-empty raw' '
+	git cat-file tag refs/tags/signed-empty >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-empty >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-short subject 'subject line'
 test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
@@ -695,6 +715,15 @@ test_atom refs/tags/signed-short contents:signature "$sig"
 test_atom refs/tags/signed-short contents "subject line
 $sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-short raw' '
+	git cat-file tag refs/tags/signed-short >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-short >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-long subject 'subject line'
 test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
@@ -708,6 +737,15 @@ test_atom refs/tags/signed-long contents "subject line
 body contents
 $sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-long raw' '
+	git cat-file tag refs/tags/signed-long >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-long >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'set up refs pointing to tree and blob' '
 	git update-ref refs/mytrees/first refs/heads/main^{tree} &&
 	git update-ref refs/myblobs/first refs/heads/main:one
@@ -720,6 +758,16 @@ test_atom refs/mytrees/first contents:body ""
 test_atom refs/mytrees/first contents:signature ""
 test_atom refs/mytrees/first contents ""
 
+test_expect_success 'basic atom: refs/mytrees/first raw' '
+	git cat-file tree refs/mytrees/first >expected &&
+	echo >>expected &&
+	git for-each-ref --format="%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/mytrees/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_atom refs/myblobs/first subject ""
 test_atom refs/myblobs/first contents:subject ""
 test_atom refs/myblobs/first body ""
@@ -727,6 +775,174 @@ test_atom refs/myblobs/first contents:body ""
 test_atom refs/myblobs/first contents:signature ""
 test_atom refs/myblobs/first contents ""
 
+test_expect_success 'basic atom: refs/myblobs/first raw' '
+	git cat-file blob refs/myblobs/first >expected &&
+	echo >>expected &&
+	git for-each-ref --format="%(raw)" refs/myblobs/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/myblobs/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/myblobs/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'set up refs pointing to binary blob' '
+	printf "a\0b\0c" >blob1 &&
+	printf "a\0c\0b" >blob2 &&
+	printf "\0a\0b\0c" >blob3 &&
+	printf "abc" >blob4 &&
+	printf "\0 \0 \0 " >blob5 &&
+	printf "\0 \0a\0 " >blob6 &&
+	printf "  " >blob7 &&
+	>blob8 &&
+	obj=$(git hash-object -w blob1) &&
+	git update-ref refs/myblobs/blob1 "$obj" &&
+	obj=$(git hash-object -w blob2) &&
+	git update-ref refs/myblobs/blob2 "$obj" &&
+	obj=$(git hash-object -w blob3) &&
+	git update-ref refs/myblobs/blob3 "$obj" &&
+	obj=$(git hash-object -w blob4) &&
+	git update-ref refs/myblobs/blob4 "$obj" &&
+	obj=$(git hash-object -w blob5) &&
+	git update-ref refs/myblobs/blob5 "$obj" &&
+	obj=$(git hash-object -w blob6) &&
+	git update-ref refs/myblobs/blob6 "$obj" &&
+	obj=$(git hash-object -w blob7) &&
+	git update-ref refs/myblobs/blob7 "$obj" &&
+	obj=$(git hash-object -w blob8) &&
+	git update-ref refs/myblobs/blob8 "$obj"
+'
+
+test_expect_success 'Verify sorts with raw' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob3
+	refs/myblobs/blob7
+	refs/mytrees/first
+	refs/myblobs/first
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob4
+	refs/heads/main
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'Verify sorts with raw:size' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/first
+	refs/myblobs/blob7
+	refs/heads/main
+	refs/myblobs/blob4
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/mytrees/first
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw:size \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:equals)' '
+	cat >expected <<-EOF &&
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	refs/myblobs/blob4
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	EOF
+	git for-each-ref --format="%(if:equals=abc)%(raw)%(then)%(refname)%(else)not equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:notequals)' '
+	cat >expected <<-EOF &&
+	refs/heads/ambiguous
+	refs/heads/main
+	refs/heads/newtag
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	equals
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob7
+	refs/myblobs/blob8
+	refs/myblobs/first
+	EOF
+	git for-each-ref --format="%(if:notequals=abc)%(raw)%(then)%(refname)%(else)equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'empty raw refs with %(if)' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob1 not empty
+	refs/myblobs/blob2 not empty
+	refs/myblobs/blob3 not empty
+	refs/myblobs/blob4 not empty
+	refs/myblobs/blob5 not empty
+	refs/myblobs/blob6 not empty
+	refs/myblobs/blob7 empty
+	refs/myblobs/blob8 empty
+	refs/myblobs/first not empty
+	EOF
+	git for-each-ref --format="%(refname) %(if)%(raw)%(then)not empty%(else)empty%(end)" \
+		refs/myblobs/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success '%(raw) with --python must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --python
+'
+
+test_expect_success '%(raw) with --tcl must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --tcl
+'
+
+test_expect_success '%(raw) with --perl must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --perl
+'
+
+test_expect_success '%(raw) with --shell must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --shell
+'
+
+test_expect_success '%(raw) with --shell and --sort=raw must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
+'
+
+test_expect_success '%(raw:size) with --shell' '
+	git for-each-ref --format="%(raw:size)" | while read line
+	do
+		echo "'\''$line'\''" >>expect
+	done &&
+	git for-each-ref --format="%(raw:size)" --shell >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'for-each-ref --format compare with cat-file --batch' '
+	git rev-parse refs/mytrees/first | git cat-file --batch >expected &&
+	git for-each-ref --format="%(objectname) %(objecttype) %(objectsize)
+%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_expect_success 'set up multiple-sort tags' '
 	for when in 100000 200000
 	do
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl
  2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
  2021-07-23  9:03   ` [PATCH v2 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
  2021-07-23  9:04   ` [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
@ 2021-07-23  9:04   ` ZheNing Hu via GitGitGadget
  2021-07-23  9:04   ` [PATCH v2 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-23  9:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Because the perl language can handle binary data correctly,
add the function perl_quote_buf_with_len(), which can specify
the length of the data and prevent the data from being truncated
at '\0' to help `--format="%(raw)"` re-support `--perl`.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |  4 ++--
 quote.c                            | 17 +++++++++++++++++
 quote.h                            |  1 +
 ref-filter.c                       | 15 +++++++++++----
 t/t6300-for-each-ref.sh            | 19 +++++++++++++++++--
 5 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index cbb6f87d13f..6da899c6296 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -241,8 +241,8 @@ raw:size::
 	The raw data size of the object.
 
 Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
-`--perl` because such language may not support arbitrary binary data in their
-string variable type.
+because such language may not support arbitrary binary data in their string
+variable type.
 
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
diff --git a/quote.c b/quote.c
index 8a3a5e39eb1..26719d21d1e 100644
--- a/quote.c
+++ b/quote.c
@@ -471,6 +471,23 @@ void perl_quote_buf(struct strbuf *sb, const char *src)
 	strbuf_addch(sb, sq);
 }
 
+void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len)
+{
+	const char sq = '\'';
+	const char bq = '\\';
+	const char *c = src;
+	const char *end = src + len;
+
+	strbuf_addch(sb, sq);
+	while (c != end) {
+		if (*c == sq || *c == bq)
+			strbuf_addch(sb, bq);
+		strbuf_addch(sb, *c);
+		c++;
+	}
+	strbuf_addch(sb, sq);
+}
+
 void python_quote_buf(struct strbuf *sb, const char *src)
 {
 	const char sq = '\'';
diff --git a/quote.h b/quote.h
index 768cc6338e2..0fe69e264b0 100644
--- a/quote.h
+++ b/quote.h
@@ -94,6 +94,7 @@ char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigne
 
 /* quoting as a string literal for other languages */
 void perl_quote_buf(struct strbuf *sb, const char *src);
+void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len);
 void python_quote_buf(struct strbuf *sb, const char *src);
 void tcl_quote_buf(struct strbuf *sb, const char *src);
 void basic_regex_quote_buf(struct strbuf *sb, const char *src);
diff --git a/ref-filter.c b/ref-filter.c
index a713507d783..0ff8574a00c 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -744,7 +744,10 @@ static void quote_formatting(struct strbuf *s, const char *str, size_t len, int
 		sq_quote_buf(s, str);
 		break;
 	case QUOTE_PERL:
-		perl_quote_buf(s, str);
+		if (len != -1)
+			perl_quote_buf_with_len(s, str, len);
+		else
+			perl_quote_buf(s, str);
 		break;
 	case QUOTE_PYTHON:
 		python_quote_buf(s, str);
@@ -1008,10 +1011,14 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
-		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
-		    used_atom[at].u.raw_data.option == RAW_BARE)
+
+		if ((format->quote_style == QUOTE_PYTHON ||
+		     format->quote_style == QUOTE_SHELL ||
+		     format->quote_style == QUOTE_TCL) &&
+		     used_atom[at].atom_type == ATOM_RAW &&
+		     used_atom[at].u.raw_data.option == RAW_BARE)
 			die(_("--format=%.*s cannot be used with"
-			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
+			      "--python, --shell, --tcl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 18554f62d94..3d15d0a5360 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -915,8 +915,23 @@ test_expect_success '%(raw) with --tcl must fail' '
 	test_must_fail git for-each-ref --format="%(raw)" --tcl
 '
 
-test_expect_success '%(raw) with --perl must fail' '
-	test_must_fail git for-each-ref --format="%(raw)" --perl
+test_expect_success '%(raw) with --perl' '
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob1 --perl | perl >actual &&
+	cmp blob1 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob3 --perl | perl >actual &&
+	cmp blob3 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob8 --perl | perl >actual &&
+	cmp blob8 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/first --perl | perl >actual &&
+	cmp one actual &&
+	git cat-file tree refs/mytrees/first > expected &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/mytrees/first --perl | perl >actual &&
+	cmp expected actual
 '
 
 test_expect_success '%(raw) with --shell must fail' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
                     ` (2 preceding siblings ...)
  2021-07-23  9:04   ` [PATCH v2 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
@ 2021-07-23  9:04   ` ZheNing Hu via GitGitGadget
  2021-07-23  9:04   ` [PATCH v2 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
  2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  5 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-23  9:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Use non-const ref_format in *_atom_parser(), which can help us
modify the members of ref_format in *_atom_parser().

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/tag.c |  2 +-
 ref-filter.c  | 44 ++++++++++++++++++++++----------------------
 ref-filter.h  |  4 ++--
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 82fcfc09824..452558ec957 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -146,7 +146,7 @@ static int verify_tag(const char *name, const char *ref,
 		      const struct object_id *oid, void *cb_data)
 {
 	int flags;
-	const struct ref_format *format = cb_data;
+	struct ref_format *format = cb_data;
 	flags = GPG_VERIFY_VERBOSE;
 
 	if (format->format)
diff --git a/ref-filter.c b/ref-filter.c
index 0ff8574a00c..1277ed483a7 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -226,7 +226,7 @@ static int strbuf_addf_ret(struct strbuf *sb, int ret, const char *fmt, ...)
 	return ret;
 }
 
-static int color_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int color_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *color_value, struct strbuf *err)
 {
 	if (!color_value)
@@ -264,7 +264,7 @@ static int refname_atom_parser_internal(struct refname_atom *atom, const char *a
 	return 0;
 }
 
-static int remote_ref_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int remote_ref_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	struct string_list params = STRING_LIST_INIT_DUP;
@@ -311,7 +311,7 @@ static int remote_ref_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objecttype_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objecttype_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -323,7 +323,7 @@ static int objecttype_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objectsize_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objectsize_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -343,7 +343,7 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int deltabase_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int deltabase_atom_parser(struct ref_format *format, struct used_atom *atom,
 				 const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -355,7 +355,7 @@ static int deltabase_atom_parser(const struct ref_format *format, struct used_at
 	return 0;
 }
 
-static int body_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int body_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -364,7 +364,7 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 	return 0;
 }
 
-static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int subject_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -376,7 +376,7 @@ static int subject_atom_parser(const struct ref_format *format, struct used_atom
 	return 0;
 }
 
-static int trailers_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int trailers_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	atom->u.contents.trailer_opts.no_divider = 1;
@@ -402,7 +402,7 @@ static int trailers_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int contents_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int contents_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -430,7 +430,7 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int raw_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -442,7 +442,7 @@ static int raw_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int oid_atom_parser(struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -461,7 +461,7 @@ static int oid_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int person_email_atom_parser(struct ref_format *format, struct used_atom *atom,
 				    const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -475,7 +475,7 @@ static int person_email_atom_parser(const struct ref_format *format, struct used
 	return 0;
 }
 
-static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int refname_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	return refname_atom_parser_internal(&atom->u.refname, arg, atom->name, err);
@@ -492,7 +492,7 @@ static align_type parse_align_position(const char *s)
 	return -1;
 }
 
-static int align_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int align_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *arg, struct strbuf *err)
 {
 	struct align *align = &atom->u.align;
@@ -544,7 +544,7 @@ static int align_atom_parser(const struct ref_format *format, struct used_atom *
 	return 0;
 }
 
-static int if_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 			  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -559,7 +559,7 @@ static int if_atom_parser(const struct ref_format *format, struct used_atom *ato
 	return 0;
 }
 
-static int head_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
 	atom->u.head = resolve_refdup("HEAD", RESOLVE_REF_READING, NULL, NULL);
@@ -570,7 +570,7 @@ static struct {
 	const char *name;
 	info_source source;
 	cmp_type cmp_type;
-	int (*parser)(const struct ref_format *format, struct used_atom *atom,
+	int (*parser)(struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
 	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -651,7 +651,7 @@ struct atom_value {
 /*
  * Used to parse format string and sort specifiers
  */
-static int parse_ref_filter_atom(const struct ref_format *format,
+static int parse_ref_filter_atom(struct ref_format *format,
 				 const char *atom, const char *ep,
 				 struct strbuf *err)
 {
@@ -2555,9 +2555,9 @@ static void append_literal(const char *cp, const char *ep, struct ref_formatting
 }
 
 int format_ref_array_item(struct ref_array_item *info,
-			   const struct ref_format *format,
-			   struct strbuf *final_buf,
-			   struct strbuf *error_buf)
+			  struct ref_format *format,
+			  struct strbuf *final_buf,
+			  struct strbuf *error_buf)
 {
 	const char *cp, *sp, *ep;
 	struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
@@ -2601,7 +2601,7 @@ int format_ref_array_item(struct ref_array_item *info,
 }
 
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format)
+		      struct ref_format *format)
 {
 	struct ref_array_item *ref_item;
 	struct strbuf output = STRBUF_INIT;
diff --git a/ref-filter.h b/ref-filter.h
index baf72a71896..74fb423fc89 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -116,7 +116,7 @@ void ref_array_sort(struct ref_sorting *sort, struct ref_array *array);
 void ref_sorting_set_sort_flags_all(struct ref_sorting *sorting, unsigned int mask, int on);
 /*  Based on the given format and quote_style, fill the strbuf */
 int format_ref_array_item(struct ref_array_item *info,
-			  const struct ref_format *format,
+			  struct ref_format *format,
 			  struct strbuf *final_buf,
 			  struct strbuf *error_buf);
 /*  Parse a single sort specifier and add it to the list */
@@ -137,7 +137,7 @@ void setup_ref_filter_porcelain_msg(void);
  * name must be a fully qualified refname.
  */
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format);
+		      struct ref_format *format);
 
 /*
  * Push a single ref onto the array; this can be used to construct your own
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
                     ` (3 preceding siblings ...)
  2021-07-23  9:04   ` [PATCH v2 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
@ 2021-07-23  9:04   ` ZheNing Hu via GitGitGadget
  2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  5 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-23  9:04 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

In order to let "cat-file --batch=%(rest)" use the ref-filter
interface, add %(rest) atom for ref-filter. Introduce the
reject_atom() to reject the atom %(rest) for "git for-each-ref",
"git branch", "git tag" and "git verify-tag".

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c             | 25 +++++++++++++++++++++++++
 ref-filter.h             |  5 ++++-
 t/t3203-branch-output.sh |  4 ++++
 t/t6300-for-each-ref.sh  |  4 ++++
 t/t7004-tag.sh           |  4 ++++
 t/t7030-verify-tag.sh    |  4 ++++
 6 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index 1277ed483a7..c4cfff10822 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -157,6 +157,7 @@ enum atom_type {
 	ATOM_IF,
 	ATOM_THEN,
 	ATOM_ELSE,
+	ATOM_REST,
 };
 
 /*
@@ -559,6 +560,15 @@ static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 	return 0;
 }
 
+static int rest_atom_parser(struct ref_format *format, struct used_atom *atom,
+			    const char *arg, struct strbuf *err)
+{
+	if (arg)
+		return strbuf_addf_ret(err, -1, _("%%(rest) does not take arguments"));
+	format->use_rest = 1;
+	return 0;
+}
+
 static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
@@ -615,6 +625,7 @@ static struct {
 	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
 	[ATOM_THEN] = { "then", SOURCE_NONE },
 	[ATOM_ELSE] = { "else", SOURCE_NONE },
+	[ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -991,6 +1002,11 @@ static const char *find_next(const char *cp)
 	return NULL;
 }
 
+static int reject_atom(enum atom_type atom_type)
+{
+	return atom_type == ATOM_REST;
+}
+
 /*
  * Make sure the format string is well formed, and parse out
  * the used atoms.
@@ -1011,6 +1027,8 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (reject_atom(used_atom[at].atom_type))
+			die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
 
 		if ((format->quote_style == QUOTE_PYTHON ||
 		     format->quote_style == QUOTE_SHELL ||
@@ -1930,6 +1948,12 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
+		} else if (atom_type == ATOM_REST) {
+			if (ref->rest)
+				v->s = xstrdup(ref->rest);
+			else
+				v->s = xstrdup("");
+			continue;
 		} else
 			continue;
 
@@ -2147,6 +2171,7 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
 
 	FLEX_ALLOC_STR(ref, refname, refname);
 	oidcpy(&ref->objectname, oid);
+	ref->rest = NULL;
 
 	return ref;
 }
diff --git a/ref-filter.h b/ref-filter.h
index 74fb423fc89..c15dee8d6b9 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -38,6 +38,7 @@ struct ref_sorting {
 
 struct ref_array_item {
 	struct object_id objectname;
+	const char *rest;
 	int flag;
 	unsigned int kind;
 	const char *symref;
@@ -76,14 +77,16 @@ struct ref_format {
 	 * verify_ref_format() afterwards to finalize.
 	 */
 	const char *format;
+	const char *rest;
 	int quote_style;
+	int use_rest;
 	int use_color;
 
 	/* Internal state to ref-filter */
 	int need_color_reset_at_eol;
 };
 
-#define REF_FORMAT_INIT { NULL, 0, -1 }
+#define REF_FORMAT_INIT { .use_color = -1 }
 
 /*  Macros for checking --merged and --no-merged options */
 #define _OPT_MERGED_NO_MERGED(option, filter, h) \
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index 5325b9f67a0..6e94c6db7b5 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -340,6 +340,10 @@ test_expect_success 'git branch --format option' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git branch with --format=%(rest) must fail' '
+	test_must_fail git branch --format="%(rest)" >actual
+'
+
 test_expect_success 'worktree colors correct' '
 	cat >expect <<-EOF &&
 	* <GREEN>(HEAD detached from fromtag)<RESET>
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 3d15d0a5360..0d2e062f791 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1211,6 +1211,10 @@ test_expect_success 'basic atom: head contents:trailers' '
 	test_cmp expect actual.clean
 '
 
+test_expect_success 'basic atom: rest must fail' '
+	test_must_fail git for-each-ref --format="%(rest)" refs/heads/main
+'
+
 test_expect_success 'trailer parsing not fooled by --- line' '
 	git commit --allow-empty -F - <<-\EOF &&
 	this is the subject
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 2f72c5c6883..082be85dffc 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1998,6 +1998,10 @@ test_expect_success '--format should list tags as per format given' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git tag -l with --format="%(rest)" must fail' '
+	test_must_fail git tag -l --format="%(rest)" "v1*"
+'
+
 test_expect_success "set up color tests" '
 	echo "<RED>v1.0<RESET>" >expect.color &&
 	echo "v1.0" >expect.bare &&
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 3cefde9602b..10faa645157 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -194,6 +194,10 @@ test_expect_success GPG 'verifying tag with --format' '
 	test_cmp expect actual
 '
 
+test_expect_success GPG 'verifying tag with --format="%(rest)" must fail' '
+	test_must_fail git verify-tag --format="%(rest)" "fourth-signed"
+'
+
 test_expect_success GPG 'verifying a forged tag with --format should fail silently' '
 	test_must_fail git verify-tag --format="tagname : %(tag)" $(cat forged1.tag) >actual-forged &&
 	test_must_be_empty actual-forged
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-22  7:39 ` [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
  2021-07-22  8:11   ` Ævar Arnfjörð Bjarmason
@ 2021-07-23  9:09   ` Jacob Keller
  2021-07-23 14:11     ` ZheNing Hu
  1 sibling, 1 reply; 45+ messages in thread
From: Jacob Keller @ 2021-07-23  9:09 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: Git mailing list, Junio C Hamano, Christian Couder, Hariom Verma,
	Bagas Sanjaya, Jeff King, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Philip Oakley, ZheNing Hu

On Thu, Jul 22, 2021 at 12:40 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> In order to let "cat-file --batch=%(rest)" use the ref-filter
> interface, add %(rest) atom for ref-filter. Introduce the
> reject_atom() to reject the atom %(rest) for "git for-each-ref",
> "git branch", "git tag" and "git verify-tag".
>

To a reader intimately familiar with cat-file, %(rest) might be
obvious.. but it would help the commit message read more easily if
there was a short explanation of what %(rest) did. It is not obvious
just from the name to me.

Thanks,
Jake

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-23  9:09   ` Jacob Keller
@ 2021-07-23 14:11     ` ZheNing Hu
  0 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu @ 2021-07-23 14:11 UTC (permalink / raw)
  To: Jacob Keller
  Cc: ZheNing Hu via GitGitGadget, Git mailing list, Junio C Hamano,
	Christian Couder, Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley

Jacob Keller <jacob.keller@gmail.com> 于2021年7月23日周五 下午5:09写道:
>
> On Thu, Jul 22, 2021 at 12:40 AM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > In order to let "cat-file --batch=%(rest)" use the ref-filter
> > interface, add %(rest) atom for ref-filter. Introduce the
> > reject_atom() to reject the atom %(rest) for "git for-each-ref",
> > "git branch", "git tag" and "git verify-tag".
> >
>
> To a reader intimately familiar with cat-file, %(rest) might be
> obvious.. but it would help the commit message read more easily if
> there was a short explanation of what %(rest) did. It is not obvious
> just from the name to me.
>

You are right, I may subconsciously think that everyone knows the
%(rest) in cat-file.
I will change it.

> Thanks,
> Jake

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-23  9:04   ` [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
@ 2021-07-23 16:38     ` Junio C Hamano
  2021-07-24  7:57       ` ZheNing Hu
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2021-07-23 16:38 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  struct atom_value {
>  	const char *s;
> +	size_t s_size;
>  	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
>  		       struct strbuf *err);
>  	uintmax_t value; /* used for sorting when not FIELD_STR */
>  	struct used_atom *atom;
>  };
>  
> +#define ATOM_VALUE_INIT { \
> +	.s_size = -1 \
> +}

This will get turned into size_t that is unsigned, and comparisons
like this one and ...

>  	case QUOTE_NONE:
> -		strbuf_addstr(s, str);
> +		if (len != -1)
> +			strbuf_add(s, str, len);
> +		else
> +			strbuf_addstr(s, str);

this one ...

> +		if (v->s_size != -1)
> +			strbuf_add(&state->stack->output, v->s, v->s_size);
> +		else
> +			strbuf_addstr(&state->stack->output, v->s);
>  	return 0;
>  }

may work as expected, but it makes readers wonder if there is
significance to negative values other than -1 (in other words, what
does it mean if v->s_size == -2, for example?).

It may make sense to 

 * Turn atom_value.s_size field into ssize_t instead of size_t

 * Rewrite (v->s_size != -1) comparison to (v->s_size < 0)


Optionally we could introduce #define ATOM_SIZE_UNSPECIFIED (-1) and
use it to initialize .s_size in ATOM_VALUE_INIT, but if we are just
going to test "is it negative, then it is not given", then it probably
is not needed.

Other changes in the whole series relative to what has been queued
looked all sensible to me.

THanks.




   

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-23 16:38     ` Junio C Hamano
@ 2021-07-24  7:57       ` ZheNing Hu
  2021-07-24 17:41         ` Junio C Hamano
  0 siblings, 1 reply; 45+ messages in thread
From: ZheNing Hu @ 2021-07-24  7:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley

Junio C Hamano <gitster@pobox.com> 于2021年7月24日周六 上午12:38写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >  struct atom_value {
> >       const char *s;
> > +     size_t s_size;
> >       int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
> >                      struct strbuf *err);
> >       uintmax_t value; /* used for sorting when not FIELD_STR */
> >       struct used_atom *atom;
> >  };
> >
> > +#define ATOM_VALUE_INIT { \
> > +     .s_size = -1 \
> > +}
>
> This will get turned into size_t that is unsigned, and comparisons
> like this one and ...
>
> >       case QUOTE_NONE:
> > -             strbuf_addstr(s, str);
> > +             if (len != -1)
> > +                     strbuf_add(s, str, len);
> > +             else
> > +                     strbuf_addstr(s, str);
>
> this one ...
>
> > +             if (v->s_size != -1)
> > +                     strbuf_add(&state->stack->output, v->s, v->s_size);
> > +             else
> > +                     strbuf_addstr(&state->stack->output, v->s);
> >       return 0;
> >  }
>
> may work as expected, but it makes readers wonder if there is
> significance to negative values other than -1 (in other words, what
> does it mean if v->s_size == -2, for example?).
>

Yes, any value less than -1 is meaningless here.

> It may make sense to
>
>  * Turn atom_value.s_size field into ssize_t instead of size_t
>

Will the conversion of size_t and ssize_t break -Wsign-conversion?
Although there is a lot of code in Git that has broken it, but I am not
sure if it is wise to use ssize_t here.

>  * Rewrite (v->s_size != -1) comparison to (v->s_size < 0)
>

For size_t, this scheme is wrong.

>
> Optionally we could introduce #define ATOM_SIZE_UNSPECIFIED (-1) and
> use it to initialize .s_size in ATOM_VALUE_INIT, but if we are just
> going to test "is it negative, then it is not given", then it probably
> is not needed.
>

It seems that this is the only method left. Although I think
ATOM_SIZE_UNSPECIFIED
is not very useful becasue we already have ATOM_VALUE_INIT.

> Other changes in the whole series relative to what has been queued
> looked all sensible to me.
>
> THanks.
>

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms
  2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
                     ` (4 preceding siblings ...)
  2021-07-23  9:04   ` [PATCH v2 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
@ 2021-07-24 14:14   ` ZheNing Hu via GitGitGadget
  2021-07-24 14:14     ` [PATCH v3 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
                       ` (5 more replies)
  5 siblings, 6 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-24 14:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu

This patch series is split from my main patch series zh/ref-filter-raw-data,
to make things easier for reviewers to read. It's last version is here:
https://lore.kernel.org/git/pull.993.v2.git.1626363626.gitgitgadget@gmail.com/

This patch series provided %(raw) and %(rest) for ref-filter, which will be
used by zh/cat-file-reuse-ref-filter-logic later.

v2--->v3:

 1. Add macro ATOM_SIZE_UNSPECIFIED to represent the initial value of
    atom_value->s_size.
 2. Add a description about %(rest) in the commit message. (copy from
    https://git-scm.com/docs/git-cat-file)

v2:
https://lore.kernel.org/git/CAP8UFD3ENBZdm8yQn6J-yb4PvUsrzTA3KGxes42dE3bTrYbr9Q@mail.gmail.com/

ZheNing Hu (5):
  [GSOC] ref-filter: add obj-type check in grab contents
  [GSOC] ref-filter: add %(raw) atom
  [GSOC] ref-filter: --format=%(raw) re-support --perl
  [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  [GSOC] ref-filter: add %(rest) atom

 Documentation/git-for-each-ref.txt |   9 ++
 builtin/tag.c                      |   2 +-
 quote.c                            |  17 ++
 quote.h                            |   1 +
 ref-filter.c                       | 244 ++++++++++++++++++++++-------
 ref-filter.h                       |   9 +-
 t/t3203-branch-output.sh           |   4 +
 t/t6300-for-each-ref.sh            | 235 +++++++++++++++++++++++++++
 t/t7004-tag.sh                     |   4 +
 t/t7030-verify-tag.sh              |   4 +
 10 files changed, 466 insertions(+), 63 deletions(-)


base-commit: daab8a564f8bbac55f70f8bf86c070e001a9b006
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1000%2Fadlternative%2Fref-filter-raw-data-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1000/adlternative/ref-filter-raw-data-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1000

Range-diff vs v2:

 1:  d77268bb188 = 1:  d77268bb188 [GSOC] ref-filter: add obj-type check in grab contents
 2:  eafb79bad62 ! 2:  0ffa549da6e [GSOC] ref-filter: add %(raw) atom
     @@ ref-filter.c: struct ref_formatting_state {
       	struct used_atom *atom;
       };
       
     ++#define ATOM_SIZE_UNSPECIFIED (-1)
     ++
      +#define ATOM_VALUE_INIT { \
     -+	.s_size = -1 \
     ++	.s_size = ATOM_SIZE_UNSPECIFIED \
      +}
      +
       /*
     @@ ref-filter.c: static int append_atom(struct atom_value *v, struct ref_formatting
      +		quote_formatting(&state->stack->output, v->s, v->s_size, state->quote_style);
       	else
      -		strbuf_addstr(&state->stack->output, v->s);
     -+		if (v->s_size != -1)
     ++		if (v->s_size != ATOM_SIZE_UNSPECIFIED)
      +			strbuf_add(&state->stack->output, v->s, v->s_size);
      +		else
      +			strbuf_addstr(&state->stack->output, v->s);
     @@ ref-filter.c: static int populate_value(struct ref_array_item *ref, struct strbu
       		const char *refname;
       		struct branch *branch = NULL;
       
     -+		v->s_size = -1;
     ++		v->s_size = ATOM_SIZE_UNSPECIFIED;
       		v->handler = append_atom;
       		v->atom = atom;
       
     @@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array
      -		cmp_fn = s->sort_flags & REF_SORTING_ICASE
      -			? strcasecmp : strcmp;
      -		cmp = cmp_fn(va->s, vb->s);
     -+		if (va->s_size == -1 && vb->s_size == -1) {
     ++		if (va->s_size == ATOM_SIZE_UNSPECIFIED && vb->s_size == ATOM_SIZE_UNSPECIFIED) {
      +			int (*cmp_fn)(const char *, const char *);
      +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
      +				? strcasecmp : strcmp;
      +			cmp = cmp_fn(va->s, vb->s);
      +		} else {
     -+			size_t a_size = va->s_size == -1 ?
     ++			size_t a_size = va->s_size == ATOM_SIZE_UNSPECIFIED ?
      +					strlen(va->s) : va->s_size;
     -+			size_t b_size = vb->s_size == -1 ?
     ++			size_t b_size = vb->s_size == ATOM_SIZE_UNSPECIFIED ?
      +					strlen(vb->s) : vb->s_size;
      +			int (*cmp_fn)(const void *, const void *, size_t);
      +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
 3:  00a29dd16d7 = 3:  ac5d98647da [GSOC] ref-filter: --format=%(raw) re-support --perl
 4:  3a8173b42ed = 4:  0722f546df9 [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
 5:  71249dfd4e0 ! 5:  2300b0463f2 [GSOC] ref-filter: add %(rest) atom
     @@ Metadata
       ## Commit message ##
          [GSOC] ref-filter: add %(rest) atom
      
     +    %(rest) is a atom used for cat-file batch mode, which can split
     +    the input lines at the first whitespace boundary, all characters
     +    before that whitespace are considered to be the object name;
     +    characters after that first run of whitespace (i.e., the "rest"
     +    of the line) are output in place of the %(rest) atom.
     +
          In order to let "cat-file --batch=%(rest)" use the ref-filter
     -    interface, add %(rest) atom for ref-filter. Introduce the
     -    reject_atom() to reject the atom %(rest) for "git for-each-ref",
     -    "git branch", "git tag" and "git verify-tag".
     +    interface, add %(rest) atom for ref-filter.
     +
     +    Introduce the reject_atom() to reject the atom %(rest) for
     +    "git for-each-ref", "git branch", "git tag" and "git verify-tag".
      
     +    Suggected-by: Jacob Keller <jacob.keller@gmail.com>
          Mentored-by: Christian Couder <christian.couder@gmail.com>
          Mentored-by: Hariom Verma <hariom18599@gmail.com>
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v3 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
@ 2021-07-24 14:14     ` ZheNing Hu via GitGitGadget
  2021-07-25  8:22       ` Jacob Keller
  2021-07-24 14:14     ` [PATCH v3 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
                       ` (4 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-24 14:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase.  We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag. So it's needed to pass a
`struct expand_data *data` instread of only `void *buf` to both
`grab_sub_body_contents()` and `grab_values()` to be able to check
the object type.

Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4db0e40ff4c..5cee6512fba 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 }
 
 /* See grab_values */
-static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
+static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
 {
 	int i;
 	const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
 	size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
+	void *buf = data->content;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
@@ -1371,10 +1372,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			continue;
 		if (deref)
 			name++;
-		if (strcmp(name, "body") &&
-		    !starts_with(name, "subject") &&
-		    !starts_with(name, "trailers") &&
-		    !starts_with(name, "contents"))
+
+		if ((data->type != OBJ_TAG &&
+		     data->type != OBJ_COMMIT) ||
+		    (strcmp(name, "body") &&
+		     !starts_with(name, "subject") &&
+		     !starts_with(name, "trailers") &&
+		     !starts_with(name, "contents")))
 			continue;
 		if (!subpos)
 			find_subpos(buf,
@@ -1438,17 +1442,19 @@ static void fill_missing_values(struct atom_value *val)
  * pointed at by the ref itself; otherwise it is the object the
  * ref (which is a tag) refers to.
  */
-static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
+static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
 {
+	void *buf = data->content;
+
 	switch (obj->type) {
 	case OBJ_TAG:
 		grab_tag_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("tagger", val, deref, buf);
 		break;
 	case OBJ_COMMIT:
 		grab_commit_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
 		break;
@@ -1678,7 +1684,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
 			return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
 					       oid_to_hex(&oi->oid), ref->refname);
 		}
-		grab_values(ref->value, deref, *obj, oi->content);
+		grab_values(ref->value, deref, *obj, oi);
 	}
 
 	grab_common_values(ref->value, deref, oi);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  2021-07-24 14:14     ` [PATCH v3 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
@ 2021-07-24 14:14     ` ZheNing Hu via GitGitGadget
  2021-07-24 14:14     ` [PATCH v3 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-24 14:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.

The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depends on the output of the atom being
text (specifically, no embedded NULs in it).

E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.

Therefore, we need to find a way to record the length of the
atom_value's member `s`. Although strbuf can already record the
string and its length, if we want to replace the type of atom_value's
member `s` with strbuf, many places in ref-filter that are filled
with dynamically allocated mermory in `v->s` are not easy to replace.
At the same time, we need to check if `v->s == NULL` in
populate_value(), and strbuf cannot easily distinguish NULL and empty
strings, but c-style "const char *" can do it. So add a new member in
`struct atom_value`: `s_size`, which can record raw object size, it
can help us add raw object data to the buffer or compare two buffers
which contain raw object data.

Note that `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, and `--perl` because if the binary raw data is passed to a
variable in such languages, these may not support arbitrary binary data
in their string variable type.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |   9 ++
 ref-filter.c                       | 144 +++++++++++++++----
 t/t6300-for-each-ref.sh            | 216 +++++++++++++++++++++++++++++
 3 files changed, 341 insertions(+), 28 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2ae2478de70..cbb6f87d13f 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -235,6 +235,15 @@ and `date` to extract the named component.  For email fields (`authoremail`,
 without angle brackets, and `:localpart` to get the part before the `@` symbol
 out of the trimmed email.
 
+The raw data in an object is `raw`.
+
+raw:size::
+	The raw data size of the object.
+
+Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
+`--perl` because such language may not support arbitrary binary data in their
+string variable type.
+
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
 
diff --git a/ref-filter.c b/ref-filter.c
index 5cee6512fba..01545618101 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -144,6 +144,7 @@ enum atom_type {
 	ATOM_BODY,
 	ATOM_TRAILERS,
 	ATOM_CONTENTS,
+	ATOM_RAW,
 	ATOM_UPSTREAM,
 	ATOM_PUSH,
 	ATOM_SYMREF,
@@ -189,6 +190,9 @@ static struct used_atom {
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
+		struct {
+			enum { RAW_BARE, RAW_LENGTH } option;
+		} raw_data;
 		struct {
 			cmp_status cmp_status;
 			const char *str;
@@ -426,6 +430,18 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
+static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.raw_data.option = RAW_BARE;
+	else if (!strcmp(arg, "size"))
+		atom->u.raw_data.option = RAW_LENGTH;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
+	return 0;
+}
+
 static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
@@ -586,6 +602,7 @@ static struct {
 	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
 	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
 	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_RAW] = { "raw", SOURCE_OBJ, FIELD_STR, raw_atom_parser },
 	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -620,12 +637,19 @@ struct ref_formatting_state {
 
 struct atom_value {
 	const char *s;
+	size_t s_size;
 	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
 		       struct strbuf *err);
 	uintmax_t value; /* used for sorting when not FIELD_STR */
 	struct used_atom *atom;
 };
 
+#define ATOM_SIZE_UNSPECIFIED (-1)
+
+#define ATOM_VALUE_INIT { \
+	.s_size = ATOM_SIZE_UNSPECIFIED \
+}
+
 /*
  * Used to parse format string and sort specifiers
  */
@@ -644,13 +668,6 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return strbuf_addf_ret(err, -1, _("malformed field name: %.*s"),
 				       (int)(ep-atom), atom);
 
-	/* Do we have the atom already used elsewhere? */
-	for (i = 0; i < used_atom_cnt; i++) {
-		int len = strlen(used_atom[i].name);
-		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
-			return i;
-	}
-
 	/*
 	 * If the atom name has a colon, strip it and everything after
 	 * it off - it specifies the format for this entry, and
@@ -660,6 +677,13 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	arg = memchr(sp, ':', ep - sp);
 	atom_len = (arg ? arg : ep) - sp;
 
+	/* Do we have the atom already used elsewhere? */
+	for (i = 0; i < used_atom_cnt; i++) {
+		int len = strlen(used_atom[i].name);
+		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
+			return i;
+	}
+
 	/* Is the atom a valid one? */
 	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
 		int len = strlen(valid_atom[i].name);
@@ -709,11 +733,14 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	return at;
 }
 
-static void quote_formatting(struct strbuf *s, const char *str, int quote_style)
+static void quote_formatting(struct strbuf *s, const char *str, size_t len, int quote_style)
 {
 	switch (quote_style) {
 	case QUOTE_NONE:
-		strbuf_addstr(s, str);
+		if (len != -1)
+			strbuf_add(s, str, len);
+		else
+			strbuf_addstr(s, str);
 		break;
 	case QUOTE_SHELL:
 		sq_quote_buf(s, str);
@@ -740,9 +767,12 @@ static int append_atom(struct atom_value *v, struct ref_formatting_state *state,
 	 * encountered.
 	 */
 	if (!state->stack->prev)
-		quote_formatting(&state->stack->output, v->s, state->quote_style);
+		quote_formatting(&state->stack->output, v->s, v->s_size, state->quote_style);
 	else
-		strbuf_addstr(&state->stack->output, v->s);
+		if (v->s_size != ATOM_SIZE_UNSPECIFIED)
+			strbuf_add(&state->stack->output, v->s, v->s_size);
+		else
+			strbuf_addstr(&state->stack->output, v->s);
 	return 0;
 }
 
@@ -842,21 +872,23 @@ static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state
 	return 0;
 }
 
-static int is_empty(const char *s)
+static int is_empty(struct strbuf *buf)
 {
-	while (*s != '\0') {
-		if (!isspace(*s))
-			return 0;
-		s++;
-	}
-	return 1;
-}
+	const char *cur = buf->buf;
+	const char *end = buf->buf + buf->len;
+
+	while (cur != end && (isspace(*cur)))
+		cur++;
+
+	return cur == end;
+ }
 
 static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_state *state,
 			     struct strbuf *err)
 {
 	struct ref_formatting_stack *cur = state->stack;
 	struct if_then_else *if_then_else = NULL;
+	size_t str_len = 0;
 
 	if (cur->at_end == if_then_else_handler)
 		if_then_else = (struct if_then_else *)cur->at_end_data;
@@ -867,18 +899,22 @@ static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_sta
 	if (if_then_else->else_atom_seen)
 		return strbuf_addf_ret(err, -1, _("format: %%(then) atom used after %%(else)"));
 	if_then_else->then_atom_seen = 1;
+	if (if_then_else->str)
+		str_len = strlen(if_then_else->str);
 	/*
 	 * If the 'equals' or 'notequals' attribute is used then
 	 * perform the required comparison. If not, only non-empty
 	 * strings satisfy the 'if' condition.
 	 */
 	if (if_then_else->cmp_status == COMPARE_EQUAL) {
-		if (!strcmp(if_then_else->str, cur->output.buf))
+		if (str_len == cur->output.len &&
+		    !memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
 	} else if (if_then_else->cmp_status == COMPARE_UNEQUAL) {
-		if (strcmp(if_then_else->str, cur->output.buf))
+		if (str_len != cur->output.len ||
+		    memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
-	} else if (cur->output.len && !is_empty(cur->output.buf))
+	} else if (cur->output.len && !is_empty(&cur->output))
 		if_then_else->condition_satisfied = 1;
 	strbuf_reset(&cur->output);
 	return 0;
@@ -924,7 +960,7 @@ static int end_atom_handler(struct atom_value *atomv, struct ref_formatting_stat
 	 * only on the topmost supporting atom.
 	 */
 	if (!current->prev->prev) {
-		quote_formatting(&s, current->output.buf, state->quote_style);
+		quote_formatting(&s, current->output.buf, current->output.len, state->quote_style);
 		strbuf_swap(&current->output, &s);
 	}
 	strbuf_release(&s);
@@ -974,6 +1010,10 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
+		    used_atom[at].u.raw_data.option == RAW_BARE)
+			die(_("--format=%.*s cannot be used with"
+			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
@@ -1367,12 +1407,25 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
 		struct used_atom *atom = &used_atom[i];
 		const char *name = atom->name;
 		struct atom_value *v = &val[i];
+		enum atom_type atom_type = atom->atom_type;
 
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
+		if (atom_type == ATOM_RAW) {
+			unsigned long buf_size = data->size;
+
+			if (atom->u.raw_data.option == RAW_BARE) {
+				v->s = xmemdupz(buf, buf_size);
+				v->s_size = buf_size;
+			} else if (atom->u.raw_data.option == RAW_LENGTH) {
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
+			}
+			continue;
+		}
+
 		if ((data->type != OBJ_TAG &&
 		     data->type != OBJ_COMMIT) ||
 		    (strcmp(name, "body") &&
@@ -1460,9 +1513,11 @@ static void grab_values(struct atom_value *val, int deref, struct object *obj, s
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	case OBJ_BLOB:
 		/* grab_blob_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	default:
 		die("Eh?  Object of type %d?", obj->type);
@@ -1766,6 +1821,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 		const char *refname;
 		struct branch *branch = NULL;
 
+		v->s_size = ATOM_SIZE_UNSPECIFIED;
 		v->handler = append_atom;
 		v->atom = atom;
 
@@ -2369,6 +2425,19 @@ static int compare_detached_head(struct ref_array_item *a, struct ref_array_item
 	return 0;
 }
 
+static int memcasecmp(const void *vs1, const void *vs2, size_t n)
+{
+	const char *s1 = vs1, *s2 = vs2;
+	const char *end = s1 + n;
+
+	for (; s1 < end; s1++, s2++) {
+		int diff = tolower(*s1) - tolower(*s2);
+		if (diff)
+			return diff;
+	}
+	return 0;
+}
+
 static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
 {
 	struct atom_value *va, *vb;
@@ -2389,10 +2458,29 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
 	} else if (s->sort_flags & REF_SORTING_VERSION) {
 		cmp = versioncmp(va->s, vb->s);
 	} else if (cmp_type == FIELD_STR) {
-		int (*cmp_fn)(const char *, const char *);
-		cmp_fn = s->sort_flags & REF_SORTING_ICASE
-			? strcasecmp : strcmp;
-		cmp = cmp_fn(va->s, vb->s);
+		if (va->s_size == ATOM_SIZE_UNSPECIFIED && vb->s_size == ATOM_SIZE_UNSPECIFIED) {
+			int (*cmp_fn)(const char *, const char *);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? strcasecmp : strcmp;
+			cmp = cmp_fn(va->s, vb->s);
+		} else {
+			size_t a_size = va->s_size == ATOM_SIZE_UNSPECIFIED ?
+					strlen(va->s) : va->s_size;
+			size_t b_size = vb->s_size == ATOM_SIZE_UNSPECIFIED ?
+					strlen(vb->s) : vb->s_size;
+			int (*cmp_fn)(const void *, const void *, size_t);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? memcasecmp : memcmp;
+
+			cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
+				     a_size : b_size);
+			if (!cmp) {
+				if (a_size > b_size)
+					cmp = 1;
+				else if (a_size < b_size)
+					cmp = -1;
+			}
+		}
 	} else {
 		if (va->value < vb->value)
 			cmp = -1;
@@ -2491,7 +2579,7 @@ int format_ref_array_item(struct ref_array_item *info,
 		append_literal(cp, sp, &state);
 	}
 	if (format->need_color_reset_at_eol) {
-		struct atom_value resetv;
+		struct atom_value resetv = ATOM_VALUE_INIT;
 		resetv.s = GIT_COLOR_RESET;
 		if (append_atom(&resetv, &state, error_buf)) {
 			pop_stack_element(&state.stack);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 9e0214076b4..18554f62d94 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -130,6 +130,8 @@ test_atom head parent:short=10 ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
+test_atom head raw "$(git cat-file commit refs/heads/main)
+"
 test_atom head '*objectname' ''
 test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
@@ -221,6 +223,15 @@ test_atom tag contents 'Tagging at 1151968727
 '
 test_atom tag HEAD ' '
 
+test_expect_success 'basic atom: refs/tags/testtag *raw' '
+	git cat-file commit refs/tags/testtag^{} >expected &&
+	git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'Check invalid atoms names are errors' '
 	test_must_fail git for-each-ref --format="%(INVALID)" refs/heads
 '
@@ -686,6 +697,15 @@ test_atom refs/tags/signed-empty contents:body ''
 test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-empty raw' '
+	git cat-file tag refs/tags/signed-empty >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-empty >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-short subject 'subject line'
 test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
@@ -695,6 +715,15 @@ test_atom refs/tags/signed-short contents:signature "$sig"
 test_atom refs/tags/signed-short contents "subject line
 $sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-short raw' '
+	git cat-file tag refs/tags/signed-short >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-short >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-long subject 'subject line'
 test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
@@ -708,6 +737,15 @@ test_atom refs/tags/signed-long contents "subject line
 body contents
 $sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-long raw' '
+	git cat-file tag refs/tags/signed-long >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-long >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'set up refs pointing to tree and blob' '
 	git update-ref refs/mytrees/first refs/heads/main^{tree} &&
 	git update-ref refs/myblobs/first refs/heads/main:one
@@ -720,6 +758,16 @@ test_atom refs/mytrees/first contents:body ""
 test_atom refs/mytrees/first contents:signature ""
 test_atom refs/mytrees/first contents ""
 
+test_expect_success 'basic atom: refs/mytrees/first raw' '
+	git cat-file tree refs/mytrees/first >expected &&
+	echo >>expected &&
+	git for-each-ref --format="%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/mytrees/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_atom refs/myblobs/first subject ""
 test_atom refs/myblobs/first contents:subject ""
 test_atom refs/myblobs/first body ""
@@ -727,6 +775,174 @@ test_atom refs/myblobs/first contents:body ""
 test_atom refs/myblobs/first contents:signature ""
 test_atom refs/myblobs/first contents ""
 
+test_expect_success 'basic atom: refs/myblobs/first raw' '
+	git cat-file blob refs/myblobs/first >expected &&
+	echo >>expected &&
+	git for-each-ref --format="%(raw)" refs/myblobs/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/myblobs/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/myblobs/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'set up refs pointing to binary blob' '
+	printf "a\0b\0c" >blob1 &&
+	printf "a\0c\0b" >blob2 &&
+	printf "\0a\0b\0c" >blob3 &&
+	printf "abc" >blob4 &&
+	printf "\0 \0 \0 " >blob5 &&
+	printf "\0 \0a\0 " >blob6 &&
+	printf "  " >blob7 &&
+	>blob8 &&
+	obj=$(git hash-object -w blob1) &&
+	git update-ref refs/myblobs/blob1 "$obj" &&
+	obj=$(git hash-object -w blob2) &&
+	git update-ref refs/myblobs/blob2 "$obj" &&
+	obj=$(git hash-object -w blob3) &&
+	git update-ref refs/myblobs/blob3 "$obj" &&
+	obj=$(git hash-object -w blob4) &&
+	git update-ref refs/myblobs/blob4 "$obj" &&
+	obj=$(git hash-object -w blob5) &&
+	git update-ref refs/myblobs/blob5 "$obj" &&
+	obj=$(git hash-object -w blob6) &&
+	git update-ref refs/myblobs/blob6 "$obj" &&
+	obj=$(git hash-object -w blob7) &&
+	git update-ref refs/myblobs/blob7 "$obj" &&
+	obj=$(git hash-object -w blob8) &&
+	git update-ref refs/myblobs/blob8 "$obj"
+'
+
+test_expect_success 'Verify sorts with raw' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob3
+	refs/myblobs/blob7
+	refs/mytrees/first
+	refs/myblobs/first
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob4
+	refs/heads/main
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'Verify sorts with raw:size' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/first
+	refs/myblobs/blob7
+	refs/heads/main
+	refs/myblobs/blob4
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/mytrees/first
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw:size \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:equals)' '
+	cat >expected <<-EOF &&
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	refs/myblobs/blob4
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	EOF
+	git for-each-ref --format="%(if:equals=abc)%(raw)%(then)%(refname)%(else)not equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:notequals)' '
+	cat >expected <<-EOF &&
+	refs/heads/ambiguous
+	refs/heads/main
+	refs/heads/newtag
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	equals
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob7
+	refs/myblobs/blob8
+	refs/myblobs/first
+	EOF
+	git for-each-ref --format="%(if:notequals=abc)%(raw)%(then)%(refname)%(else)equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'empty raw refs with %(if)' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob1 not empty
+	refs/myblobs/blob2 not empty
+	refs/myblobs/blob3 not empty
+	refs/myblobs/blob4 not empty
+	refs/myblobs/blob5 not empty
+	refs/myblobs/blob6 not empty
+	refs/myblobs/blob7 empty
+	refs/myblobs/blob8 empty
+	refs/myblobs/first not empty
+	EOF
+	git for-each-ref --format="%(refname) %(if)%(raw)%(then)not empty%(else)empty%(end)" \
+		refs/myblobs/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success '%(raw) with --python must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --python
+'
+
+test_expect_success '%(raw) with --tcl must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --tcl
+'
+
+test_expect_success '%(raw) with --perl must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --perl
+'
+
+test_expect_success '%(raw) with --shell must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --shell
+'
+
+test_expect_success '%(raw) with --shell and --sort=raw must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
+'
+
+test_expect_success '%(raw:size) with --shell' '
+	git for-each-ref --format="%(raw:size)" | while read line
+	do
+		echo "'\''$line'\''" >>expect
+	done &&
+	git for-each-ref --format="%(raw:size)" --shell >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'for-each-ref --format compare with cat-file --batch' '
+	git rev-parse refs/mytrees/first | git cat-file --batch >expected &&
+	git for-each-ref --format="%(objectname) %(objecttype) %(objectsize)
+%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_expect_success 'set up multiple-sort tags' '
 	for when in 100000 200000
 	do
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl
  2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  2021-07-24 14:14     ` [PATCH v3 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
  2021-07-24 14:14     ` [PATCH v3 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
@ 2021-07-24 14:14     ` ZheNing Hu via GitGitGadget
  2021-07-25  8:27       ` Jacob Keller
  2021-07-24 14:14     ` [PATCH v3 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
                       ` (2 subsequent siblings)
  5 siblings, 1 reply; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-24 14:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Because the perl language can handle binary data correctly,
add the function perl_quote_buf_with_len(), which can specify
the length of the data and prevent the data from being truncated
at '\0' to help `--format="%(raw)"` re-support `--perl`.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |  4 ++--
 quote.c                            | 17 +++++++++++++++++
 quote.h                            |  1 +
 ref-filter.c                       | 15 +++++++++++----
 t/t6300-for-each-ref.sh            | 19 +++++++++++++++++--
 5 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index cbb6f87d13f..6da899c6296 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -241,8 +241,8 @@ raw:size::
 	The raw data size of the object.
 
 Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
-`--perl` because such language may not support arbitrary binary data in their
-string variable type.
+because such language may not support arbitrary binary data in their string
+variable type.
 
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
diff --git a/quote.c b/quote.c
index 8a3a5e39eb1..26719d21d1e 100644
--- a/quote.c
+++ b/quote.c
@@ -471,6 +471,23 @@ void perl_quote_buf(struct strbuf *sb, const char *src)
 	strbuf_addch(sb, sq);
 }
 
+void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len)
+{
+	const char sq = '\'';
+	const char bq = '\\';
+	const char *c = src;
+	const char *end = src + len;
+
+	strbuf_addch(sb, sq);
+	while (c != end) {
+		if (*c == sq || *c == bq)
+			strbuf_addch(sb, bq);
+		strbuf_addch(sb, *c);
+		c++;
+	}
+	strbuf_addch(sb, sq);
+}
+
 void python_quote_buf(struct strbuf *sb, const char *src)
 {
 	const char sq = '\'';
diff --git a/quote.h b/quote.h
index 768cc6338e2..0fe69e264b0 100644
--- a/quote.h
+++ b/quote.h
@@ -94,6 +94,7 @@ char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigne
 
 /* quoting as a string literal for other languages */
 void perl_quote_buf(struct strbuf *sb, const char *src);
+void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len);
 void python_quote_buf(struct strbuf *sb, const char *src);
 void tcl_quote_buf(struct strbuf *sb, const char *src);
 void basic_regex_quote_buf(struct strbuf *sb, const char *src);
diff --git a/ref-filter.c b/ref-filter.c
index 01545618101..597354c941d 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -746,7 +746,10 @@ static void quote_formatting(struct strbuf *s, const char *str, size_t len, int
 		sq_quote_buf(s, str);
 		break;
 	case QUOTE_PERL:
-		perl_quote_buf(s, str);
+		if (len != -1)
+			perl_quote_buf_with_len(s, str, len);
+		else
+			perl_quote_buf(s, str);
 		break;
 	case QUOTE_PYTHON:
 		python_quote_buf(s, str);
@@ -1010,10 +1013,14 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
-		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
-		    used_atom[at].u.raw_data.option == RAW_BARE)
+
+		if ((format->quote_style == QUOTE_PYTHON ||
+		     format->quote_style == QUOTE_SHELL ||
+		     format->quote_style == QUOTE_TCL) &&
+		     used_atom[at].atom_type == ATOM_RAW &&
+		     used_atom[at].u.raw_data.option == RAW_BARE)
 			die(_("--format=%.*s cannot be used with"
-			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
+			      "--python, --shell, --tcl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 18554f62d94..3d15d0a5360 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -915,8 +915,23 @@ test_expect_success '%(raw) with --tcl must fail' '
 	test_must_fail git for-each-ref --format="%(raw)" --tcl
 '
 
-test_expect_success '%(raw) with --perl must fail' '
-	test_must_fail git for-each-ref --format="%(raw)" --perl
+test_expect_success '%(raw) with --perl' '
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob1 --perl | perl >actual &&
+	cmp blob1 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob3 --perl | perl >actual &&
+	cmp blob3 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob8 --perl | perl >actual &&
+	cmp blob8 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/first --perl | perl >actual &&
+	cmp one actual &&
+	git cat-file tree refs/mytrees/first > expected &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/mytrees/first --perl | perl >actual &&
+	cmp expected actual
 '
 
 test_expect_success '%(raw) with --shell must fail' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                       ` (2 preceding siblings ...)
  2021-07-24 14:14     ` [PATCH v3 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
@ 2021-07-24 14:14     ` ZheNing Hu via GitGitGadget
  2021-07-24 14:14     ` [PATCH v3 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
  2021-07-26  3:26     ` [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  5 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-24 14:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Use non-const ref_format in *_atom_parser(), which can help us
modify the members of ref_format in *_atom_parser().

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/tag.c |  2 +-
 ref-filter.c  | 44 ++++++++++++++++++++++----------------------
 ref-filter.h  |  4 ++--
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 82fcfc09824..452558ec957 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -146,7 +146,7 @@ static int verify_tag(const char *name, const char *ref,
 		      const struct object_id *oid, void *cb_data)
 {
 	int flags;
-	const struct ref_format *format = cb_data;
+	struct ref_format *format = cb_data;
 	flags = GPG_VERIFY_VERBOSE;
 
 	if (format->format)
diff --git a/ref-filter.c b/ref-filter.c
index 597354c941d..7cef839aaf1 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -226,7 +226,7 @@ static int strbuf_addf_ret(struct strbuf *sb, int ret, const char *fmt, ...)
 	return ret;
 }
 
-static int color_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int color_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *color_value, struct strbuf *err)
 {
 	if (!color_value)
@@ -264,7 +264,7 @@ static int refname_atom_parser_internal(struct refname_atom *atom, const char *a
 	return 0;
 }
 
-static int remote_ref_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int remote_ref_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	struct string_list params = STRING_LIST_INIT_DUP;
@@ -311,7 +311,7 @@ static int remote_ref_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objecttype_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objecttype_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -323,7 +323,7 @@ static int objecttype_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objectsize_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objectsize_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -343,7 +343,7 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int deltabase_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int deltabase_atom_parser(struct ref_format *format, struct used_atom *atom,
 				 const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -355,7 +355,7 @@ static int deltabase_atom_parser(const struct ref_format *format, struct used_at
 	return 0;
 }
 
-static int body_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int body_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -364,7 +364,7 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 	return 0;
 }
 
-static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int subject_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -376,7 +376,7 @@ static int subject_atom_parser(const struct ref_format *format, struct used_atom
 	return 0;
 }
 
-static int trailers_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int trailers_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	atom->u.contents.trailer_opts.no_divider = 1;
@@ -402,7 +402,7 @@ static int trailers_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int contents_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int contents_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -430,7 +430,7 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int raw_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -442,7 +442,7 @@ static int raw_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int oid_atom_parser(struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -461,7 +461,7 @@ static int oid_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int person_email_atom_parser(struct ref_format *format, struct used_atom *atom,
 				    const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -475,7 +475,7 @@ static int person_email_atom_parser(const struct ref_format *format, struct used
 	return 0;
 }
 
-static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int refname_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	return refname_atom_parser_internal(&atom->u.refname, arg, atom->name, err);
@@ -492,7 +492,7 @@ static align_type parse_align_position(const char *s)
 	return -1;
 }
 
-static int align_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int align_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *arg, struct strbuf *err)
 {
 	struct align *align = &atom->u.align;
@@ -544,7 +544,7 @@ static int align_atom_parser(const struct ref_format *format, struct used_atom *
 	return 0;
 }
 
-static int if_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 			  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -559,7 +559,7 @@ static int if_atom_parser(const struct ref_format *format, struct used_atom *ato
 	return 0;
 }
 
-static int head_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
 	atom->u.head = resolve_refdup("HEAD", RESOLVE_REF_READING, NULL, NULL);
@@ -570,7 +570,7 @@ static struct {
 	const char *name;
 	info_source source;
 	cmp_type cmp_type;
-	int (*parser)(const struct ref_format *format, struct used_atom *atom,
+	int (*parser)(struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
 	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -653,7 +653,7 @@ struct atom_value {
 /*
  * Used to parse format string and sort specifiers
  */
-static int parse_ref_filter_atom(const struct ref_format *format,
+static int parse_ref_filter_atom(struct ref_format *format,
 				 const char *atom, const char *ep,
 				 struct strbuf *err)
 {
@@ -2557,9 +2557,9 @@ static void append_literal(const char *cp, const char *ep, struct ref_formatting
 }
 
 int format_ref_array_item(struct ref_array_item *info,
-			   const struct ref_format *format,
-			   struct strbuf *final_buf,
-			   struct strbuf *error_buf)
+			  struct ref_format *format,
+			  struct strbuf *final_buf,
+			  struct strbuf *error_buf)
 {
 	const char *cp, *sp, *ep;
 	struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
@@ -2603,7 +2603,7 @@ int format_ref_array_item(struct ref_array_item *info,
 }
 
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format)
+		      struct ref_format *format)
 {
 	struct ref_array_item *ref_item;
 	struct strbuf output = STRBUF_INIT;
diff --git a/ref-filter.h b/ref-filter.h
index baf72a71896..74fb423fc89 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -116,7 +116,7 @@ void ref_array_sort(struct ref_sorting *sort, struct ref_array *array);
 void ref_sorting_set_sort_flags_all(struct ref_sorting *sorting, unsigned int mask, int on);
 /*  Based on the given format and quote_style, fill the strbuf */
 int format_ref_array_item(struct ref_array_item *info,
-			  const struct ref_format *format,
+			  struct ref_format *format,
 			  struct strbuf *final_buf,
 			  struct strbuf *error_buf);
 /*  Parse a single sort specifier and add it to the list */
@@ -137,7 +137,7 @@ void setup_ref_filter_porcelain_msg(void);
  * name must be a fully qualified refname.
  */
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format);
+		      struct ref_format *format);
 
 /*
  * Push a single ref onto the array; this can be used to construct your own
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                       ` (3 preceding siblings ...)
  2021-07-24 14:14     ` [PATCH v3 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
@ 2021-07-24 14:14     ` ZheNing Hu via GitGitGadget
  2021-07-25  8:29       ` Jacob Keller
  2021-07-26  3:26     ` [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  5 siblings, 1 reply; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-24 14:14 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

%(rest) is a atom used for cat-file batch mode, which can split
the input lines at the first whitespace boundary, all characters
before that whitespace are considered to be the object name;
characters after that first run of whitespace (i.e., the "rest"
of the line) are output in place of the %(rest) atom.

In order to let "cat-file --batch=%(rest)" use the ref-filter
interface, add %(rest) atom for ref-filter.

Introduce the reject_atom() to reject the atom %(rest) for
"git for-each-ref", "git branch", "git tag" and "git verify-tag".

Suggected-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c             | 25 +++++++++++++++++++++++++
 ref-filter.h             |  5 ++++-
 t/t3203-branch-output.sh |  4 ++++
 t/t6300-for-each-ref.sh  |  4 ++++
 t/t7004-tag.sh           |  4 ++++
 t/t7030-verify-tag.sh    |  4 ++++
 6 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index 7cef839aaf1..85e9e657fe7 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -157,6 +157,7 @@ enum atom_type {
 	ATOM_IF,
 	ATOM_THEN,
 	ATOM_ELSE,
+	ATOM_REST,
 };
 
 /*
@@ -559,6 +560,15 @@ static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 	return 0;
 }
 
+static int rest_atom_parser(struct ref_format *format, struct used_atom *atom,
+			    const char *arg, struct strbuf *err)
+{
+	if (arg)
+		return strbuf_addf_ret(err, -1, _("%%(rest) does not take arguments"));
+	format->use_rest = 1;
+	return 0;
+}
+
 static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
@@ -615,6 +625,7 @@ static struct {
 	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
 	[ATOM_THEN] = { "then", SOURCE_NONE },
 	[ATOM_ELSE] = { "else", SOURCE_NONE },
+	[ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -993,6 +1004,11 @@ static const char *find_next(const char *cp)
 	return NULL;
 }
 
+static int reject_atom(enum atom_type atom_type)
+{
+	return atom_type == ATOM_REST;
+}
+
 /*
  * Make sure the format string is well formed, and parse out
  * the used atoms.
@@ -1013,6 +1029,8 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (reject_atom(used_atom[at].atom_type))
+			die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
 
 		if ((format->quote_style == QUOTE_PYTHON ||
 		     format->quote_style == QUOTE_SHELL ||
@@ -1932,6 +1950,12 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
+		} else if (atom_type == ATOM_REST) {
+			if (ref->rest)
+				v->s = xstrdup(ref->rest);
+			else
+				v->s = xstrdup("");
+			continue;
 		} else
 			continue;
 
@@ -2149,6 +2173,7 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
 
 	FLEX_ALLOC_STR(ref, refname, refname);
 	oidcpy(&ref->objectname, oid);
+	ref->rest = NULL;
 
 	return ref;
 }
diff --git a/ref-filter.h b/ref-filter.h
index 74fb423fc89..c15dee8d6b9 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -38,6 +38,7 @@ struct ref_sorting {
 
 struct ref_array_item {
 	struct object_id objectname;
+	const char *rest;
 	int flag;
 	unsigned int kind;
 	const char *symref;
@@ -76,14 +77,16 @@ struct ref_format {
 	 * verify_ref_format() afterwards to finalize.
 	 */
 	const char *format;
+	const char *rest;
 	int quote_style;
+	int use_rest;
 	int use_color;
 
 	/* Internal state to ref-filter */
 	int need_color_reset_at_eol;
 };
 
-#define REF_FORMAT_INIT { NULL, 0, -1 }
+#define REF_FORMAT_INIT { .use_color = -1 }
 
 /*  Macros for checking --merged and --no-merged options */
 #define _OPT_MERGED_NO_MERGED(option, filter, h) \
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index 5325b9f67a0..6e94c6db7b5 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -340,6 +340,10 @@ test_expect_success 'git branch --format option' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git branch with --format=%(rest) must fail' '
+	test_must_fail git branch --format="%(rest)" >actual
+'
+
 test_expect_success 'worktree colors correct' '
 	cat >expect <<-EOF &&
 	* <GREEN>(HEAD detached from fromtag)<RESET>
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 3d15d0a5360..0d2e062f791 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1211,6 +1211,10 @@ test_expect_success 'basic atom: head contents:trailers' '
 	test_cmp expect actual.clean
 '
 
+test_expect_success 'basic atom: rest must fail' '
+	test_must_fail git for-each-ref --format="%(rest)" refs/heads/main
+'
+
 test_expect_success 'trailer parsing not fooled by --- line' '
 	git commit --allow-empty -F - <<-\EOF &&
 	this is the subject
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 2f72c5c6883..082be85dffc 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1998,6 +1998,10 @@ test_expect_success '--format should list tags as per format given' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git tag -l with --format="%(rest)" must fail' '
+	test_must_fail git tag -l --format="%(rest)" "v1*"
+'
+
 test_expect_success "set up color tests" '
 	echo "<RED>v1.0<RESET>" >expect.color &&
 	echo "v1.0" >expect.bare &&
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 3cefde9602b..10faa645157 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -194,6 +194,10 @@ test_expect_success GPG 'verifying tag with --format' '
 	test_cmp expect actual
 '
 
+test_expect_success GPG 'verifying tag with --format="%(rest)" must fail' '
+	test_must_fail git verify-tag --format="%(rest)" "fourth-signed"
+'
+
 test_expect_success GPG 'verifying a forged tag with --format should fail silently' '
 	test_must_fail git verify-tag --format="tagname : %(tag)" $(cat forged1.tag) >actual-forged &&
 	test_must_be_empty actual-forged
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-24  7:57       ` ZheNing Hu
@ 2021-07-24 17:41         ` Junio C Hamano
  2021-07-25 12:47           ` ZheNing Hu
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2021-07-24 17:41 UTC (permalink / raw)
  To: ZheNing Hu
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley

ZheNing Hu <adlternative@gmail.com> writes:

>> It may make sense to
>>
>>  * Turn atom_value.s_size field into ssize_t instead of size_t
>>
>
> Will the conversion of size_t and ssize_t break -Wsign-conversion?
> Although there is a lot of code in Git that has broken it, but I am not
> sure if it is wise to use ssize_t here.
>
>>  * Rewrite (v->s_size != -1) comparison to (v->s_size < 0)
>>
>
> For size_t, this scheme is wrong.
>
>>
>> Optionally we could introduce #define ATOM_SIZE_UNSPECIFIED (-1) and
>> use it to initialize .s_size in ATOM_VALUE_INIT, but if we are just
>> going to test "is it negative, then it is not given", then it probably
>> is not needed.
>>
>
> It seems that this is the only method left. Although I think
> ATOM_SIZE_UNSPECIFIED
> is not very useful becasue we already have ATOM_VALUE_INIT.

Sorry, but I think you misread what I wrote.

These three were not offered as "you can do one of these three, pick
one you like" choices.  I meant to say "I think it makes sense to do
all these three things, but the last one is optional, doing only the
first two may be good enough".  As the second one requires that the
first is done, of course, doing only the second one would not make
sense.

Also, you seem to have missed the distinction between _INIT and
_UNSPECIFIED.  You can initialize something to (1) a reasonable
default value, or (2) unusable value that you can detect at runtime
and notice that it was not set.  If you called something to
FOO_INIT, your readers cannot tell which one it is, but if you call
it FOO_UNSPECIFIED, it is clear it is the latter case.

In many places, we do use ssize_t for "normally this is size, but we
can express exception cases (like errors) by storing negative value"
in our codebase (grep for it), and I think the member in question is
prime candidate for such use.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-24 14:14     ` [PATCH v3 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
@ 2021-07-25  8:22       ` Jacob Keller
  0 siblings, 0 replies; 45+ messages in thread
From: Jacob Keller @ 2021-07-25  8:22 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: Git mailing list, Junio C Hamano, Christian Couder, Hariom Verma,
	Bagas Sanjaya, Jeff King, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Philip Oakley, ZheNing Hu

On Sat, Jul 24, 2021 at 7:14 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> Only tag and commit objects use `grab_sub_body_contents()` to grab
> object contents in the current codebase.  We want to teach the
> function to also handle blobs and trees to get their raw data,
> without parsing a blob (whose contents looks like a commit or a tag)
> incorrectly as a commit or a tag. So it's needed to pass a
> `struct expand_data *data` instread of only `void *buf` to both
> `grab_sub_body_contents()` and `grab_values()` to be able to check
> the object type.
>
> Skip the block of code that is specific to handling commits and tags
> early when the given object is of a wrong type to help later
> addition to handle other types of objects in this function.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Hariom Verma <hariom18599@gmail.com>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---

Nice. The addition of changing the type made sense and is spelled out nicely.

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>

>  ref-filter.c | 24 +++++++++++++++---------
>  1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 4db0e40ff4c..5cee6512fba 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
>  }
>
>  /* See grab_values */
> -static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
> +static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
>  {
>         int i;
>         const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
>         size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
> +       void *buf = data->content;
>
>         for (i = 0; i < used_atom_cnt; i++) {
>                 struct used_atom *atom = &used_atom[i];
> @@ -1371,10 +1372,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
>                         continue;
>                 if (deref)
>                         name++;
> -               if (strcmp(name, "body") &&
> -                   !starts_with(name, "subject") &&
> -                   !starts_with(name, "trailers") &&
> -                   !starts_with(name, "contents"))
> +
> +               if ((data->type != OBJ_TAG &&
> +                    data->type != OBJ_COMMIT) ||
> +                   (strcmp(name, "body") &&
> +                    !starts_with(name, "subject") &&
> +                    !starts_with(name, "trailers") &&
> +                    !starts_with(name, "contents")))
>                         continue;
>                 if (!subpos)
>                         find_subpos(buf,
> @@ -1438,17 +1442,19 @@ static void fill_missing_values(struct atom_value *val)
>   * pointed at by the ref itself; otherwise it is the object the
>   * ref (which is a tag) refers to.
>   */
> -static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
> +static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
>  {
> +       void *buf = data->content;
> +
>         switch (obj->type) {
>         case OBJ_TAG:
>                 grab_tag_values(val, deref, obj);
> -               grab_sub_body_contents(val, deref, buf);
> +               grab_sub_body_contents(val, deref, data);
>                 grab_person("tagger", val, deref, buf);
>                 break;
>         case OBJ_COMMIT:
>                 grab_commit_values(val, deref, obj);
> -               grab_sub_body_contents(val, deref, buf);
> +               grab_sub_body_contents(val, deref, data);
>                 grab_person("author", val, deref, buf);
>                 grab_person("committer", val, deref, buf);
>                 break;
> @@ -1678,7 +1684,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
>                         return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
>                                                oid_to_hex(&oi->oid), ref->refname);
>                 }
> -               grab_values(ref->value, deref, *obj, oi->content);
> +               grab_values(ref->value, deref, *obj, oi);
>         }
>
>         grab_common_values(ref->value, deref, oi);
> --
> gitgitgadget
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl
  2021-07-24 14:14     ` [PATCH v3 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
@ 2021-07-25  8:27       ` Jacob Keller
  2021-07-25 13:18         ` ZheNing Hu
  0 siblings, 1 reply; 45+ messages in thread
From: Jacob Keller @ 2021-07-25  8:27 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: Git mailing list, Junio C Hamano, Christian Couder, Hariom Verma,
	Bagas Sanjaya, Jeff King, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Philip Oakley, ZheNing Hu

On Sat, Jul 24, 2021 at 7:14 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> Because the perl language can handle binary data correctly,
> add the function perl_quote_buf_with_len(), which can specify
> the length of the data and prevent the data from being truncated
> at '\0' to help `--format="%(raw)"` re-support `--perl`.
>
> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---

I presume that we used to support raw with perl before in cat-file? It
did read a bit weird to see "Re-support" in the title, because the
previous patch which introduced the raw atom when in ref-filter code
never supported perl. Still, I think it's fairly clear either way and
that change wouldn't deserve a reroll on its own.

Makes sense. Reviewed-by: Jacob Keller <jacob.keller@gmail.com>

>  Documentation/git-for-each-ref.txt |  4 ++--
>  quote.c                            | 17 +++++++++++++++++
>  quote.h                            |  1 +
>  ref-filter.c                       | 15 +++++++++++----
>  t/t6300-for-each-ref.sh            | 19 +++++++++++++++++--
>  5 files changed, 48 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
> index cbb6f87d13f..6da899c6296 100644
> --- a/Documentation/git-for-each-ref.txt
> +++ b/Documentation/git-for-each-ref.txt
> @@ -241,8 +241,8 @@ raw:size::
>         The raw data size of the object.
>
>  Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
> -`--perl` because such language may not support arbitrary binary data in their
> -string variable type.
> +because such language may not support arbitrary binary data in their string
> +variable type.
>
>  The message in a commit or a tag object is `contents`, from which
>  `contents:<part>` can be used to extract various parts out of:
> diff --git a/quote.c b/quote.c
> index 8a3a5e39eb1..26719d21d1e 100644
> --- a/quote.c
> +++ b/quote.c
> @@ -471,6 +471,23 @@ void perl_quote_buf(struct strbuf *sb, const char *src)
>         strbuf_addch(sb, sq);
>  }
>
> +void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len)
> +{
> +       const char sq = '\'';
> +       const char bq = '\\';
> +       const char *c = src;
> +       const char *end = src + len;
> +
> +       strbuf_addch(sb, sq);
> +       while (c != end) {
> +               if (*c == sq || *c == bq)
> +                       strbuf_addch(sb, bq);
> +               strbuf_addch(sb, *c);
> +               c++;
> +       }
> +       strbuf_addch(sb, sq);
> +}
> +
>  void python_quote_buf(struct strbuf *sb, const char *src)
>  {
>         const char sq = '\'';
> diff --git a/quote.h b/quote.h
> index 768cc6338e2..0fe69e264b0 100644
> --- a/quote.h
> +++ b/quote.h
> @@ -94,6 +94,7 @@ char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigne
>
>  /* quoting as a string literal for other languages */
>  void perl_quote_buf(struct strbuf *sb, const char *src);
> +void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len);
>  void python_quote_buf(struct strbuf *sb, const char *src);
>  void tcl_quote_buf(struct strbuf *sb, const char *src);
>  void basic_regex_quote_buf(struct strbuf *sb, const char *src);
> diff --git a/ref-filter.c b/ref-filter.c
> index 01545618101..597354c941d 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -746,7 +746,10 @@ static void quote_formatting(struct strbuf *s, const char *str, size_t len, int
>                 sq_quote_buf(s, str);
>                 break;
>         case QUOTE_PERL:
> -               perl_quote_buf(s, str);
> +               if (len != -1)
> +                       perl_quote_buf_with_len(s, str, len);
> +               else
> +                       perl_quote_buf(s, str);
>                 break;
>         case QUOTE_PYTHON:
>                 python_quote_buf(s, str);
> @@ -1010,10 +1013,14 @@ int verify_ref_format(struct ref_format *format)
>                 at = parse_ref_filter_atom(format, sp + 2, ep, &err);
>                 if (at < 0)
>                         die("%s", err.buf);
> -               if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
> -                   used_atom[at].u.raw_data.option == RAW_BARE)
> +
> +               if ((format->quote_style == QUOTE_PYTHON ||
> +                    format->quote_style == QUOTE_SHELL ||
> +                    format->quote_style == QUOTE_TCL) &&
> +                    used_atom[at].atom_type == ATOM_RAW &&
> +                    used_atom[at].u.raw_data.option == RAW_BARE)
>                         die(_("--format=%.*s cannot be used with"
> -                             "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
> +                             "--python, --shell, --tcl"), (int)(ep - sp - 2), sp + 2);
>                 cp = ep + 1;
>

Right, so now we have to check all of the format quote styles instead
of just checking that its set to any of them, since we want to
continue in the case of perl.

Ok.

>                 if (skip_prefix(used_atom[at].name, "color:", &color))
> diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
> index 18554f62d94..3d15d0a5360 100755
> --- a/t/t6300-for-each-ref.sh
> +++ b/t/t6300-for-each-ref.sh
> @@ -915,8 +915,23 @@ test_expect_success '%(raw) with --tcl must fail' '
>         test_must_fail git for-each-ref --format="%(raw)" --tcl
>  '
>
> -test_expect_success '%(raw) with --perl must fail' '
> -       test_must_fail git for-each-ref --format="%(raw)" --perl
> +test_expect_success '%(raw) with --perl' '
> +       git for-each-ref --format="\$name= %(raw);
> +print \"\$name\"" refs/myblobs/blob1 --perl | perl >actual &&
> +       cmp blob1 actual &&
> +       git for-each-ref --format="\$name= %(raw);
> +print \"\$name\"" refs/myblobs/blob3 --perl | perl >actual &&
> +       cmp blob3 actual &&
> +       git for-each-ref --format="\$name= %(raw);
> +print \"\$name\"" refs/myblobs/blob8 --perl | perl >actual &&
> +       cmp blob8 actual &&
> +       git for-each-ref --format="\$name= %(raw);
> +print \"\$name\"" refs/myblobs/first --perl | perl >actual &&
> +       cmp one actual &&
> +       git cat-file tree refs/mytrees/first > expected &&
> +       git for-each-ref --format="\$name= %(raw);
> +print \"\$name\"" refs/mytrees/first --perl | perl >actual &&
> +       cmp expected actual
>  '
>
>  test_expect_success '%(raw) with --shell must fail' '
> --
> gitgitgadget
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-24 14:14     ` [PATCH v3 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
@ 2021-07-25  8:29       ` Jacob Keller
  2021-07-26 17:34         ` Junio C Hamano
  0 siblings, 1 reply; 45+ messages in thread
From: Jacob Keller @ 2021-07-25  8:29 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: Git mailing list, Junio C Hamano, Christian Couder, Hariom Verma,
	Bagas Sanjaya, Jeff King, Ævar Arnfjörð Bjarmason,
	Eric Sunshine, Philip Oakley, ZheNing Hu

On Sat, Jul 24, 2021 at 7:14 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> %(rest) is a atom used for cat-file batch mode, which can split
> the input lines at the first whitespace boundary, all characters
> before that whitespace are considered to be the object name;
> characters after that first run of whitespace (i.e., the "rest"
> of the line) are output in place of the %(rest) atom.
>
> In order to let "cat-file --batch=%(rest)" use the ref-filter
> interface, add %(rest) atom for ref-filter.
>
> Introduce the reject_atom() to reject the atom %(rest) for
> "git for-each-ref", "git branch", "git tag" and "git verify-tag".
>
> Suggected-by: Jacob Keller <jacob.keller@gmail.com>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Hariom Verma <hariom18599@gmail.com>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---

Thanks for the improved  commit message here. That helps clarify what
this atom does and aids in understanding the implementation.

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>

>  ref-filter.c             | 25 +++++++++++++++++++++++++
>  ref-filter.h             |  5 ++++-
>  t/t3203-branch-output.sh |  4 ++++
>  t/t6300-for-each-ref.sh  |  4 ++++
>  t/t7004-tag.sh           |  4 ++++
>  t/t7030-verify-tag.sh    |  4 ++++
>  6 files changed, 45 insertions(+), 1 deletion(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 7cef839aaf1..85e9e657fe7 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -157,6 +157,7 @@ enum atom_type {
>         ATOM_IF,
>         ATOM_THEN,
>         ATOM_ELSE,
> +       ATOM_REST,
>  };
>
>  /*
> @@ -559,6 +560,15 @@ static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
>         return 0;
>  }
>
> +static int rest_atom_parser(struct ref_format *format, struct used_atom *atom,
> +                           const char *arg, struct strbuf *err)
> +{
> +       if (arg)
> +               return strbuf_addf_ret(err, -1, _("%%(rest) does not take arguments"));
> +       format->use_rest = 1;
> +       return 0;
> +}
> +
>  static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
>                             const char *arg, struct strbuf *unused_err)
>  {
> @@ -615,6 +625,7 @@ static struct {
>         [ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
>         [ATOM_THEN] = { "then", SOURCE_NONE },
>         [ATOM_ELSE] = { "else", SOURCE_NONE },
> +       [ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
>         /*
>          * Please update $__git_ref_fieldlist in git-completion.bash
>          * when you add new atoms
> @@ -993,6 +1004,11 @@ static const char *find_next(const char *cp)
>         return NULL;
>  }
>
> +static int reject_atom(enum atom_type atom_type)
> +{
> +       return atom_type == ATOM_REST;
> +}
> +
>  /*
>   * Make sure the format string is well formed, and parse out
>   * the used atoms.
> @@ -1013,6 +1029,8 @@ int verify_ref_format(struct ref_format *format)
>                 at = parse_ref_filter_atom(format, sp + 2, ep, &err);
>                 if (at < 0)
>                         die("%s", err.buf);
> +               if (reject_atom(used_atom[at].atom_type))
> +                       die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
>
>                 if ((format->quote_style == QUOTE_PYTHON ||
>                      format->quote_style == QUOTE_SHELL ||
> @@ -1932,6 +1950,12 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
>                         v->handler = else_atom_handler;
>                         v->s = xstrdup("");
>                         continue;
> +               } else if (atom_type == ATOM_REST) {
> +                       if (ref->rest)
> +                               v->s = xstrdup(ref->rest);
> +                       else
> +                               v->s = xstrdup("");
> +                       continue;
>                 } else
>                         continue;
>
> @@ -2149,6 +2173,7 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
>
>         FLEX_ALLOC_STR(ref, refname, refname);
>         oidcpy(&ref->objectname, oid);
> +       ref->rest = NULL;
>
>         return ref;
>  }
> diff --git a/ref-filter.h b/ref-filter.h
> index 74fb423fc89..c15dee8d6b9 100644
> --- a/ref-filter.h
> +++ b/ref-filter.h
> @@ -38,6 +38,7 @@ struct ref_sorting {
>
>  struct ref_array_item {
>         struct object_id objectname;
> +       const char *rest;
>         int flag;
>         unsigned int kind;
>         const char *symref;
> @@ -76,14 +77,16 @@ struct ref_format {
>          * verify_ref_format() afterwards to finalize.
>          */
>         const char *format;
> +       const char *rest;
>         int quote_style;
> +       int use_rest;
>         int use_color;
>
>         /* Internal state to ref-filter */
>         int need_color_reset_at_eol;
>  };
>
> -#define REF_FORMAT_INIT { NULL, 0, -1 }
> +#define REF_FORMAT_INIT { .use_color = -1 }
>
>  /*  Macros for checking --merged and --no-merged options */
>  #define _OPT_MERGED_NO_MERGED(option, filter, h) \
> diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
> index 5325b9f67a0..6e94c6db7b5 100755
> --- a/t/t3203-branch-output.sh
> +++ b/t/t3203-branch-output.sh
> @@ -340,6 +340,10 @@ test_expect_success 'git branch --format option' '
>         test_cmp expect actual
>  '
>
> +test_expect_success 'git branch with --format=%(rest) must fail' '
> +       test_must_fail git branch --format="%(rest)" >actual
> +'
> +
>  test_expect_success 'worktree colors correct' '
>         cat >expect <<-EOF &&
>         * <GREEN>(HEAD detached from fromtag)<RESET>
> diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
> index 3d15d0a5360..0d2e062f791 100755
> --- a/t/t6300-for-each-ref.sh
> +++ b/t/t6300-for-each-ref.sh
> @@ -1211,6 +1211,10 @@ test_expect_success 'basic atom: head contents:trailers' '
>         test_cmp expect actual.clean
>  '
>
> +test_expect_success 'basic atom: rest must fail' '
> +       test_must_fail git for-each-ref --format="%(rest)" refs/heads/main
> +'
> +
>  test_expect_success 'trailer parsing not fooled by --- line' '
>         git commit --allow-empty -F - <<-\EOF &&
>         this is the subject
> diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
> index 2f72c5c6883..082be85dffc 100755
> --- a/t/t7004-tag.sh
> +++ b/t/t7004-tag.sh
> @@ -1998,6 +1998,10 @@ test_expect_success '--format should list tags as per format given' '
>         test_cmp expect actual
>  '
>
> +test_expect_success 'git tag -l with --format="%(rest)" must fail' '
> +       test_must_fail git tag -l --format="%(rest)" "v1*"
> +'
> +
>  test_expect_success "set up color tests" '
>         echo "<RED>v1.0<RESET>" >expect.color &&
>         echo "v1.0" >expect.bare &&
> diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
> index 3cefde9602b..10faa645157 100755
> --- a/t/t7030-verify-tag.sh
> +++ b/t/t7030-verify-tag.sh
> @@ -194,6 +194,10 @@ test_expect_success GPG 'verifying tag with --format' '
>         test_cmp expect actual
>  '
>
> +test_expect_success GPG 'verifying tag with --format="%(rest)" must fail' '
> +       test_must_fail git verify-tag --format="%(rest)" "fourth-signed"
> +'
> +
>  test_expect_success GPG 'verifying a forged tag with --format should fail silently' '
>         test_must_fail git verify-tag --format="tagname : %(tag)" $(cat forged1.tag) >actual-forged &&
>         test_must_be_empty actual-forged
> --
> gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-24 17:41         ` Junio C Hamano
@ 2021-07-25 12:47           ` ZheNing Hu
  0 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu @ 2021-07-25 12:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley

Junio C Hamano <gitster@pobox.com> 于2021年7月25日周日 上午1:41写道:
>
> ZheNing Hu <adlternative@gmail.com> writes:
>
> >> It may make sense to
> >>
> >>  * Turn atom_value.s_size field into ssize_t instead of size_t
> >>
> >
> > Will the conversion of size_t and ssize_t break -Wsign-conversion?
> > Although there is a lot of code in Git that has broken it, but I am not
> > sure if it is wise to use ssize_t here.
> >
> >>  * Rewrite (v->s_size != -1) comparison to (v->s_size < 0)
> >>
> >
> > For size_t, this scheme is wrong.
> >
> >>
> >> Optionally we could introduce #define ATOM_SIZE_UNSPECIFIED (-1) and
> >> use it to initialize .s_size in ATOM_VALUE_INIT, but if we are just
> >> going to test "is it negative, then it is not given", then it probably
> >> is not needed.
> >>
> >
> > It seems that this is the only method left. Although I think
> > ATOM_SIZE_UNSPECIFIED
> > is not very useful becasue we already have ATOM_VALUE_INIT.
>
> Sorry, but I think you misread what I wrote.
>
> These three were not offered as "you can do one of these three, pick
> one you like" choices.  I meant to say "I think it makes sense to do
> all these three things, but the last one is optional, doing only the
> first two may be good enough".  As the second one requires that the
> first is done, of course, doing only the second one would not make
> sense.
>

OK. I know it now, I will do all of them.

> Also, you seem to have missed the distinction between _INIT and
> _UNSPECIFIED.  You can initialize something to (1) a reasonable
> default value, or (2) unusable value that you can detect at runtime
> and notice that it was not set.  If you called something to
> FOO_INIT, your readers cannot tell which one it is, but if you call
> it FOO_UNSPECIFIED, it is clear it is the latter case.
>

Thanks for clarification, I understand the difference between them now.

> In many places, we do use ssize_t for "normally this is size, but we
> can express exception cases (like errors) by storing negative value"
> in our codebase (grep for it), and I think the member in question is
> prime candidate for such use.
>

Yeah, as abspath.c:137, `ssize_t len` used to distinguish situations
if an error is
returned from strbuf_readlink().

           ssize_t len;
           ...
           len = strbuf_readlink(&symlink, resolved->buf,
                                             st.st_size);
           if (len < 0) {
                               if (flags & REALPATH_DIE_ON_ERROR)
                                       die_errno("Invalid symlink '%s'",
                                                 resolved->buf);
                               else
                                       goto error_out;
                       }
           ...

Or just read() and write(), they return is ssize_t too,
which can return -1 and set errno when an exception occurs.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl
  2021-07-25  8:27       ` Jacob Keller
@ 2021-07-25 13:18         ` ZheNing Hu
  0 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu @ 2021-07-25 13:18 UTC (permalink / raw)
  To: Jacob Keller
  Cc: ZheNing Hu via GitGitGadget, Git mailing list, Junio C Hamano,
	Christian Couder, Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley

Jacob Keller <jacob.keller@gmail.com> 于2021年7月25日周日 下午4:27写道:
>
> On Sat, Jul 24, 2021 at 7:14 AM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Because the perl language can handle binary data correctly,
> > add the function perl_quote_buf_with_len(), which can specify
> > the length of the data and prevent the data from being truncated
> > at '\0' to help `--format="%(raw)"` re-support `--perl`.
> >
> > Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
>
> I presume that we used to support raw with perl before in cat-file? It
> did read a bit weird to see "Re-support" in the title, because the
> previous patch which introduced the raw atom when in ref-filter code
> never supported perl. Still, I think it's fairly clear either way and
> that change wouldn't deserve a reroll on its own.
>

No, we haven't support it. It can be said that this commit is only split from
the previous commit. Therefore, "re-support" is indeed wrong. I will modify it.

> Makes sense. Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
>
> > @@ -1010,10 +1013,14 @@ int verify_ref_format(struct ref_format *format)
> >                 at = parse_ref_filter_atom(format, sp + 2, ep, &err);
> >                 if (at < 0)
> >                         die("%s", err.buf);
> > -               if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
> > -                   used_atom[at].u.raw_data.option == RAW_BARE)
> > +
> > +               if ((format->quote_style == QUOTE_PYTHON ||
> > +                    format->quote_style == QUOTE_SHELL ||
> > +                    format->quote_style == QUOTE_TCL) &&
> > +                    used_atom[at].atom_type == ATOM_RAW &&
> > +                    used_atom[at].u.raw_data.option == RAW_BARE)
> >                         die(_("--format=%.*s cannot be used with"
> > -                             "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
> > +                             "--python, --shell, --tcl"), (int)(ep - sp - 2), sp + 2);
> >                 cp = ep + 1;
> >
>
> Right, so now we have to check all of the format quote styles instead
> of just checking that its set to any of them, since we want to
> continue in the case of perl.
>
> Ok.
>

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms
  2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                       ` (4 preceding siblings ...)
  2021-07-24 14:14     ` [PATCH v3 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
@ 2021-07-26  3:26     ` ZheNing Hu via GitGitGadget
  2021-07-26  3:26       ` [PATCH v4 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
                         ` (4 more replies)
  5 siblings, 5 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-26  3:26 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu

This patch series is split from my main patch series zh/ref-filter-raw-data,
to make things easier for reviewers to read. It's last version is here:
https://lore.kernel.org/git/pull.993.v2.git.1626363626.gitgitgadget@gmail.com/

This patch series provided %(raw) and %(rest) for ref-filter, which will be
used by zh/cat-file-reuse-ref-filter-logic later.

v3--->v4:

 1. Change the type of atom_value->s_size from size_t to ssize_t, and change
    the type of quote_formatting() parameters len from size_t to ssize_t at
    the same time. All of if (v->s_size == -1) change to if (v->s_size < 0).
 2. Correct the commit subject of "[GSOC] ref-filter: --format=%(raw)
    re-support --perl" to "[GSOC] ref-filter: --format=%(raw) support
    --perl" and its commit messages.

v3:
https://lore.kernel.org/git/pull.1000.v3.git.1627136061.gitgitgadget@gmail.com/

ZheNing Hu (5):
  [GSOC] ref-filter: add obj-type check in grab contents
  [GSOC] ref-filter: add %(raw) atom
  [GSOC] ref-filter: --format=%(raw) support --perl
  [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  [GSOC] ref-filter: add %(rest) atom

 Documentation/git-for-each-ref.txt |   9 ++
 builtin/tag.c                      |   2 +-
 quote.c                            |  17 ++
 quote.h                            |   1 +
 ref-filter.c                       | 243 ++++++++++++++++++++++-------
 ref-filter.h                       |   9 +-
 t/t3203-branch-output.sh           |   4 +
 t/t6300-for-each-ref.sh            | 235 ++++++++++++++++++++++++++++
 t/t7004-tag.sh                     |   4 +
 t/t7030-verify-tag.sh              |   4 +
 10 files changed, 465 insertions(+), 63 deletions(-)


base-commit: daab8a564f8bbac55f70f8bf86c070e001a9b006
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1000%2Fadlternative%2Fref-filter-raw-data-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1000/adlternative/ref-filter-raw-data-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1000

Range-diff vs v3:

 1:  d77268bb188 ! 1:  0e294ff1626 [GSOC] ref-filter: add obj-type check in grab contents
     @@ Commit message
          early when the given object is of a wrong type to help later
          addition to handle other types of objects in this function.
      
     +    Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
          Mentored-by: Christian Couder <christian.couder@gmail.com>
          Mentored-by: Hariom Verma <hariom18599@gmail.com>
          Helped-by: Junio C Hamano <gitster@pobox.com>
 2:  0ffa549da6e ! 2:  96d79a3d16b [GSOC] ref-filter: add %(raw) atom
     @@ Commit message
          variable in such languages, these may not support arbitrary binary data
          in their string variable type.
      
     +    Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
          Mentored-by: Christian Couder <christian.couder@gmail.com>
          Mentored-by: Hariom Verma <hariom18599@gmail.com>
          Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
     @@ ref-filter.c: struct ref_formatting_state {
       
       struct atom_value {
       	const char *s;
     -+	size_t s_size;
     ++	ssize_t s_size;
       	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
       		       struct strbuf *err);
       	uintmax_t value; /* used for sorting when not FIELD_STR */
     @@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
       }
       
      -static void quote_formatting(struct strbuf *s, const char *str, int quote_style)
     -+static void quote_formatting(struct strbuf *s, const char *str, size_t len, int quote_style)
     ++static void quote_formatting(struct strbuf *s, const char *str, ssize_t len, int quote_style)
       {
       	switch (quote_style) {
       	case QUOTE_NONE:
      -		strbuf_addstr(s, str);
     -+		if (len != -1)
     -+			strbuf_add(s, str, len);
     -+		else
     ++		if (len < 0)
      +			strbuf_addstr(s, str);
     ++		else
     ++			strbuf_add(s, str, len);
       		break;
       	case QUOTE_SHELL:
       		sq_quote_buf(s, str);
     @@ ref-filter.c: static int append_atom(struct atom_value *v, struct ref_formatting
       	 */
       	if (!state->stack->prev)
      -		quote_formatting(&state->stack->output, v->s, state->quote_style);
     +-	else
      +		quote_formatting(&state->stack->output, v->s, v->s_size, state->quote_style);
     - 	else
     --		strbuf_addstr(&state->stack->output, v->s);
     -+		if (v->s_size != ATOM_SIZE_UNSPECIFIED)
     -+			strbuf_add(&state->stack->output, v->s, v->s_size);
     -+		else
     -+			strbuf_addstr(&state->stack->output, v->s);
     ++	else if (v->s_size < 0)
     + 		strbuf_addstr(&state->stack->output, v->s);
     ++	else
     ++		strbuf_add(&state->stack->output, v->s, v->s_size);
       	return 0;
       }
       
     @@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array
      -		cmp_fn = s->sort_flags & REF_SORTING_ICASE
      -			? strcasecmp : strcmp;
      -		cmp = cmp_fn(va->s, vb->s);
     -+		if (va->s_size == ATOM_SIZE_UNSPECIFIED && vb->s_size == ATOM_SIZE_UNSPECIFIED) {
     ++		if (va->s_size < 0 && vb->s_size < 0) {
      +			int (*cmp_fn)(const char *, const char *);
      +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
      +				? strcasecmp : strcmp;
      +			cmp = cmp_fn(va->s, vb->s);
      +		} else {
     -+			size_t a_size = va->s_size == ATOM_SIZE_UNSPECIFIED ?
     ++			size_t a_size = va->s_size < 0 ?
      +					strlen(va->s) : va->s_size;
     -+			size_t b_size = vb->s_size == ATOM_SIZE_UNSPECIFIED ?
     ++			size_t b_size = vb->s_size < 0 ?
      +					strlen(vb->s) : vb->s_size;
      +			int (*cmp_fn)(const void *, const void *, size_t);
      +			cmp_fn = s->sort_flags & REF_SORTING_ICASE
 3:  ac5d98647da ! 3:  d14fae52080 [GSOC] ref-filter: --format=%(raw) re-support --perl
     @@ Metadata
      Author: ZheNing Hu <adlternative@gmail.com>
      
       ## Commit message ##
     -    [GSOC] ref-filter: --format=%(raw) re-support --perl
     +    [GSOC] ref-filter: --format=%(raw) support --perl
      
          Because the perl language can handle binary data correctly,
          add the function perl_quote_buf_with_len(), which can specify
          the length of the data and prevent the data from being truncated
     -    at '\0' to help `--format="%(raw)"` re-support `--perl`.
     +    at '\0' to help `--format="%(raw)"` support `--perl`.
      
     +    Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
          Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
     @@ quote.h: char *quote_path(const char *in, const char *prefix, struct strbuf *out
       void basic_regex_quote_buf(struct strbuf *sb, const char *src);
      
       ## ref-filter.c ##
     -@@ ref-filter.c: static void quote_formatting(struct strbuf *s, const char *str, size_t len, int
     +@@ ref-filter.c: static void quote_formatting(struct strbuf *s, const char *str, ssize_t len, int
       		sq_quote_buf(s, str);
       		break;
       	case QUOTE_PERL:
      -		perl_quote_buf(s, str);
     -+		if (len != -1)
     -+			perl_quote_buf_with_len(s, str, len);
     -+		else
     ++		if (len < 0)
      +			perl_quote_buf(s, str);
     ++		else
     ++			perl_quote_buf_with_len(s, str, len);
       		break;
       	case QUOTE_PYTHON:
       		python_quote_buf(s, str);
 4:  0722f546df9 = 4:  0aa398e8a8f [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
 5:  2300b0463f2 ! 5:  207cc512964 [GSOC] ref-filter: add %(rest) atom
     @@ Commit message
          Introduce the reject_atom() to reject the atom %(rest) for
          "git for-each-ref", "git branch", "git tag" and "git verify-tag".
      
     +    Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
          Suggected-by: Jacob Keller <jacob.keller@gmail.com>
          Mentored-by: Christian Couder <christian.couder@gmail.com>
          Mentored-by: Hariom Verma <hariom18599@gmail.com>

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v4 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-26  3:26     ` [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
@ 2021-07-26  3:26       ` ZheNing Hu via GitGitGadget
  2021-07-26 19:15         ` Junio C Hamano
  2021-07-26  3:26       ` [PATCH v4 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
                         ` (3 subsequent siblings)
  4 siblings, 1 reply; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-26  3:26 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase.  We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag. So it's needed to pass a
`struct expand_data *data` instread of only `void *buf` to both
`grab_sub_body_contents()` and `grab_values()` to be able to check
the object type.

Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 4db0e40ff4c..5cee6512fba 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
 }
 
 /* See grab_values */
-static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
+static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
 {
 	int i;
 	const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
 	size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
+	void *buf = data->content;
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
@@ -1371,10 +1372,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			continue;
 		if (deref)
 			name++;
-		if (strcmp(name, "body") &&
-		    !starts_with(name, "subject") &&
-		    !starts_with(name, "trailers") &&
-		    !starts_with(name, "contents"))
+
+		if ((data->type != OBJ_TAG &&
+		     data->type != OBJ_COMMIT) ||
+		    (strcmp(name, "body") &&
+		     !starts_with(name, "subject") &&
+		     !starts_with(name, "trailers") &&
+		     !starts_with(name, "contents")))
 			continue;
 		if (!subpos)
 			find_subpos(buf,
@@ -1438,17 +1442,19 @@ static void fill_missing_values(struct atom_value *val)
  * pointed at by the ref itself; otherwise it is the object the
  * ref (which is a tag) refers to.
  */
-static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
+static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
 {
+	void *buf = data->content;
+
 	switch (obj->type) {
 	case OBJ_TAG:
 		grab_tag_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("tagger", val, deref, buf);
 		break;
 	case OBJ_COMMIT:
 		grab_commit_values(val, deref, obj);
-		grab_sub_body_contents(val, deref, buf);
+		grab_sub_body_contents(val, deref, data);
 		grab_person("author", val, deref, buf);
 		grab_person("committer", val, deref, buf);
 		break;
@@ -1678,7 +1684,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
 			return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
 					       oid_to_hex(&oi->oid), ref->refname);
 		}
-		grab_values(ref->value, deref, *obj, oi->content);
+		grab_values(ref->value, deref, *obj, oi);
 	}
 
 	grab_common_values(ref->value, deref, oi);
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v4 2/5] [GSOC] ref-filter: add %(raw) atom
  2021-07-26  3:26     ` [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  2021-07-26  3:26       ` [PATCH v4 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
@ 2021-07-26  3:26       ` ZheNing Hu via GitGitGadget
  2021-07-26  3:26       ` [PATCH v4 3/5] [GSOC] ref-filter: --format=%(raw) support --perl ZheNing Hu via GitGitGadget
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-26  3:26 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.

The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depends on the output of the atom being
text (specifically, no embedded NULs in it).

E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.

Therefore, we need to find a way to record the length of the
atom_value's member `s`. Although strbuf can already record the
string and its length, if we want to replace the type of atom_value's
member `s` with strbuf, many places in ref-filter that are filled
with dynamically allocated mermory in `v->s` are not easy to replace.
At the same time, we need to check if `v->s == NULL` in
populate_value(), and strbuf cannot easily distinguish NULL and empty
strings, but c-style "const char *" can do it. So add a new member in
`struct atom_value`: `s_size`, which can record raw object size, it
can help us add raw object data to the buffer or compare two buffers
which contain raw object data.

Note that `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, and `--perl` because if the binary raw data is passed to a
variable in such languages, these may not support arbitrary binary data
in their string variable type.

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Bagas Sanjaya <bagasdotme@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |   9 ++
 ref-filter.c                       | 143 +++++++++++++++----
 t/t6300-for-each-ref.sh            | 216 +++++++++++++++++++++++++++++
 3 files changed, 340 insertions(+), 28 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 2ae2478de70..cbb6f87d13f 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -235,6 +235,15 @@ and `date` to extract the named component.  For email fields (`authoremail`,
 without angle brackets, and `:localpart` to get the part before the `@` symbol
 out of the trimmed email.
 
+The raw data in an object is `raw`.
+
+raw:size::
+	The raw data size of the object.
+
+Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
+`--perl` because such language may not support arbitrary binary data in their
+string variable type.
+
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
 
diff --git a/ref-filter.c b/ref-filter.c
index 5cee6512fba..0d5eb91ed54 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -144,6 +144,7 @@ enum atom_type {
 	ATOM_BODY,
 	ATOM_TRAILERS,
 	ATOM_CONTENTS,
+	ATOM_RAW,
 	ATOM_UPSTREAM,
 	ATOM_PUSH,
 	ATOM_SYMREF,
@@ -189,6 +190,9 @@ static struct used_atom {
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
+		struct {
+			enum { RAW_BARE, RAW_LENGTH } option;
+		} raw_data;
 		struct {
 			cmp_status cmp_status;
 			const char *str;
@@ -426,6 +430,18 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
+static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.raw_data.option = RAW_BARE;
+	else if (!strcmp(arg, "size"))
+		atom->u.raw_data.option = RAW_LENGTH;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
+	return 0;
+}
+
 static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
@@ -586,6 +602,7 @@ static struct {
 	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
 	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
 	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_RAW] = { "raw", SOURCE_OBJ, FIELD_STR, raw_atom_parser },
 	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
 	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -620,12 +637,19 @@ struct ref_formatting_state {
 
 struct atom_value {
 	const char *s;
+	ssize_t s_size;
 	int (*handler)(struct atom_value *atomv, struct ref_formatting_state *state,
 		       struct strbuf *err);
 	uintmax_t value; /* used for sorting when not FIELD_STR */
 	struct used_atom *atom;
 };
 
+#define ATOM_SIZE_UNSPECIFIED (-1)
+
+#define ATOM_VALUE_INIT { \
+	.s_size = ATOM_SIZE_UNSPECIFIED \
+}
+
 /*
  * Used to parse format string and sort specifiers
  */
@@ -644,13 +668,6 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return strbuf_addf_ret(err, -1, _("malformed field name: %.*s"),
 				       (int)(ep-atom), atom);
 
-	/* Do we have the atom already used elsewhere? */
-	for (i = 0; i < used_atom_cnt; i++) {
-		int len = strlen(used_atom[i].name);
-		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
-			return i;
-	}
-
 	/*
 	 * If the atom name has a colon, strip it and everything after
 	 * it off - it specifies the format for this entry, and
@@ -660,6 +677,13 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	arg = memchr(sp, ':', ep - sp);
 	atom_len = (arg ? arg : ep) - sp;
 
+	/* Do we have the atom already used elsewhere? */
+	for (i = 0; i < used_atom_cnt; i++) {
+		int len = strlen(used_atom[i].name);
+		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
+			return i;
+	}
+
 	/* Is the atom a valid one? */
 	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
 		int len = strlen(valid_atom[i].name);
@@ -709,11 +733,14 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	return at;
 }
 
-static void quote_formatting(struct strbuf *s, const char *str, int quote_style)
+static void quote_formatting(struct strbuf *s, const char *str, ssize_t len, int quote_style)
 {
 	switch (quote_style) {
 	case QUOTE_NONE:
-		strbuf_addstr(s, str);
+		if (len < 0)
+			strbuf_addstr(s, str);
+		else
+			strbuf_add(s, str, len);
 		break;
 	case QUOTE_SHELL:
 		sq_quote_buf(s, str);
@@ -740,9 +767,11 @@ static int append_atom(struct atom_value *v, struct ref_formatting_state *state,
 	 * encountered.
 	 */
 	if (!state->stack->prev)
-		quote_formatting(&state->stack->output, v->s, state->quote_style);
-	else
+		quote_formatting(&state->stack->output, v->s, v->s_size, state->quote_style);
+	else if (v->s_size < 0)
 		strbuf_addstr(&state->stack->output, v->s);
+	else
+		strbuf_add(&state->stack->output, v->s, v->s_size);
 	return 0;
 }
 
@@ -842,21 +871,23 @@ static int if_atom_handler(struct atom_value *atomv, struct ref_formatting_state
 	return 0;
 }
 
-static int is_empty(const char *s)
+static int is_empty(struct strbuf *buf)
 {
-	while (*s != '\0') {
-		if (!isspace(*s))
-			return 0;
-		s++;
-	}
-	return 1;
-}
+	const char *cur = buf->buf;
+	const char *end = buf->buf + buf->len;
+
+	while (cur != end && (isspace(*cur)))
+		cur++;
+
+	return cur == end;
+ }
 
 static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_state *state,
 			     struct strbuf *err)
 {
 	struct ref_formatting_stack *cur = state->stack;
 	struct if_then_else *if_then_else = NULL;
+	size_t str_len = 0;
 
 	if (cur->at_end == if_then_else_handler)
 		if_then_else = (struct if_then_else *)cur->at_end_data;
@@ -867,18 +898,22 @@ static int then_atom_handler(struct atom_value *atomv, struct ref_formatting_sta
 	if (if_then_else->else_atom_seen)
 		return strbuf_addf_ret(err, -1, _("format: %%(then) atom used after %%(else)"));
 	if_then_else->then_atom_seen = 1;
+	if (if_then_else->str)
+		str_len = strlen(if_then_else->str);
 	/*
 	 * If the 'equals' or 'notequals' attribute is used then
 	 * perform the required comparison. If not, only non-empty
 	 * strings satisfy the 'if' condition.
 	 */
 	if (if_then_else->cmp_status == COMPARE_EQUAL) {
-		if (!strcmp(if_then_else->str, cur->output.buf))
+		if (str_len == cur->output.len &&
+		    !memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
 	} else if (if_then_else->cmp_status == COMPARE_UNEQUAL) {
-		if (strcmp(if_then_else->str, cur->output.buf))
+		if (str_len != cur->output.len ||
+		    memcmp(if_then_else->str, cur->output.buf, cur->output.len))
 			if_then_else->condition_satisfied = 1;
-	} else if (cur->output.len && !is_empty(cur->output.buf))
+	} else if (cur->output.len && !is_empty(&cur->output))
 		if_then_else->condition_satisfied = 1;
 	strbuf_reset(&cur->output);
 	return 0;
@@ -924,7 +959,7 @@ static int end_atom_handler(struct atom_value *atomv, struct ref_formatting_stat
 	 * only on the topmost supporting atom.
 	 */
 	if (!current->prev->prev) {
-		quote_formatting(&s, current->output.buf, state->quote_style);
+		quote_formatting(&s, current->output.buf, current->output.len, state->quote_style);
 		strbuf_swap(&current->output, &s);
 	}
 	strbuf_release(&s);
@@ -974,6 +1009,10 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
+		    used_atom[at].u.raw_data.option == RAW_BARE)
+			die(_("--format=%.*s cannot be used with"
+			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
@@ -1367,12 +1406,25 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, struct exp
 		struct used_atom *atom = &used_atom[i];
 		const char *name = atom->name;
 		struct atom_value *v = &val[i];
+		enum atom_type atom_type = atom->atom_type;
 
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
+		if (atom_type == ATOM_RAW) {
+			unsigned long buf_size = data->size;
+
+			if (atom->u.raw_data.option == RAW_BARE) {
+				v->s = xmemdupz(buf, buf_size);
+				v->s_size = buf_size;
+			} else if (atom->u.raw_data.option == RAW_LENGTH) {
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
+			}
+			continue;
+		}
+
 		if ((data->type != OBJ_TAG &&
 		     data->type != OBJ_COMMIT) ||
 		    (strcmp(name, "body") &&
@@ -1460,9 +1512,11 @@ static void grab_values(struct atom_value *val, int deref, struct object *obj, s
 		break;
 	case OBJ_TREE:
 		/* grab_tree_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	case OBJ_BLOB:
 		/* grab_blob_values(val, deref, obj, buf, sz); */
+		grab_sub_body_contents(val, deref, data);
 		break;
 	default:
 		die("Eh?  Object of type %d?", obj->type);
@@ -1766,6 +1820,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 		const char *refname;
 		struct branch *branch = NULL;
 
+		v->s_size = ATOM_SIZE_UNSPECIFIED;
 		v->handler = append_atom;
 		v->atom = atom;
 
@@ -2369,6 +2424,19 @@ static int compare_detached_head(struct ref_array_item *a, struct ref_array_item
 	return 0;
 }
 
+static int memcasecmp(const void *vs1, const void *vs2, size_t n)
+{
+	const char *s1 = vs1, *s2 = vs2;
+	const char *end = s1 + n;
+
+	for (; s1 < end; s1++, s2++) {
+		int diff = tolower(*s1) - tolower(*s2);
+		if (diff)
+			return diff;
+	}
+	return 0;
+}
+
 static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
 {
 	struct atom_value *va, *vb;
@@ -2389,10 +2457,29 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
 	} else if (s->sort_flags & REF_SORTING_VERSION) {
 		cmp = versioncmp(va->s, vb->s);
 	} else if (cmp_type == FIELD_STR) {
-		int (*cmp_fn)(const char *, const char *);
-		cmp_fn = s->sort_flags & REF_SORTING_ICASE
-			? strcasecmp : strcmp;
-		cmp = cmp_fn(va->s, vb->s);
+		if (va->s_size < 0 && vb->s_size < 0) {
+			int (*cmp_fn)(const char *, const char *);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? strcasecmp : strcmp;
+			cmp = cmp_fn(va->s, vb->s);
+		} else {
+			size_t a_size = va->s_size < 0 ?
+					strlen(va->s) : va->s_size;
+			size_t b_size = vb->s_size < 0 ?
+					strlen(vb->s) : vb->s_size;
+			int (*cmp_fn)(const void *, const void *, size_t);
+			cmp_fn = s->sort_flags & REF_SORTING_ICASE
+				? memcasecmp : memcmp;
+
+			cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
+				     a_size : b_size);
+			if (!cmp) {
+				if (a_size > b_size)
+					cmp = 1;
+				else if (a_size < b_size)
+					cmp = -1;
+			}
+		}
 	} else {
 		if (va->value < vb->value)
 			cmp = -1;
@@ -2491,7 +2578,7 @@ int format_ref_array_item(struct ref_array_item *info,
 		append_literal(cp, sp, &state);
 	}
 	if (format->need_color_reset_at_eol) {
-		struct atom_value resetv;
+		struct atom_value resetv = ATOM_VALUE_INIT;
 		resetv.s = GIT_COLOR_RESET;
 		if (append_atom(&resetv, &state, error_buf)) {
 			pop_stack_element(&state.stack);
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 9e0214076b4..18554f62d94 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -130,6 +130,8 @@ test_atom head parent:short=10 ''
 test_atom head numparent 0
 test_atom head object ''
 test_atom head type ''
+test_atom head raw "$(git cat-file commit refs/heads/main)
+"
 test_atom head '*objectname' ''
 test_atom head '*objecttype' ''
 test_atom head author 'A U Thor <author@example.com> 1151968724 +0200'
@@ -221,6 +223,15 @@ test_atom tag contents 'Tagging at 1151968727
 '
 test_atom tag HEAD ' '
 
+test_expect_success 'basic atom: refs/tags/testtag *raw' '
+	git cat-file commit refs/tags/testtag^{} >expected &&
+	git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'Check invalid atoms names are errors' '
 	test_must_fail git for-each-ref --format="%(INVALID)" refs/heads
 '
@@ -686,6 +697,15 @@ test_atom refs/tags/signed-empty contents:body ''
 test_atom refs/tags/signed-empty contents:signature "$sig"
 test_atom refs/tags/signed-empty contents "$sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-empty raw' '
+	git cat-file tag refs/tags/signed-empty >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-empty >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-short subject 'subject line'
 test_atom refs/tags/signed-short subject:sanitize 'subject-line'
 test_atom refs/tags/signed-short contents:subject 'subject line'
@@ -695,6 +715,15 @@ test_atom refs/tags/signed-short contents:signature "$sig"
 test_atom refs/tags/signed-short contents "subject line
 $sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-short raw' '
+	git cat-file tag refs/tags/signed-short >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-short >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_atom refs/tags/signed-long subject 'subject line'
 test_atom refs/tags/signed-long subject:sanitize 'subject-line'
 test_atom refs/tags/signed-long contents:subject 'subject line'
@@ -708,6 +737,15 @@ test_atom refs/tags/signed-long contents "subject line
 body contents
 $sig"
 
+test_expect_success GPG 'basic atom: refs/tags/signed-long raw' '
+	git cat-file tag refs/tags/signed-long >expected &&
+	git for-each-ref --format="%(raw)" refs/tags/signed-long >actual &&
+	sanitize_pgp <expected >expected.clean &&
+	echo >>expected.clean &&
+	sanitize_pgp <actual >actual.clean &&
+	test_cmp expected.clean actual.clean
+'
+
 test_expect_success 'set up refs pointing to tree and blob' '
 	git update-ref refs/mytrees/first refs/heads/main^{tree} &&
 	git update-ref refs/myblobs/first refs/heads/main:one
@@ -720,6 +758,16 @@ test_atom refs/mytrees/first contents:body ""
 test_atom refs/mytrees/first contents:signature ""
 test_atom refs/mytrees/first contents ""
 
+test_expect_success 'basic atom: refs/mytrees/first raw' '
+	git cat-file tree refs/mytrees/first >expected &&
+	echo >>expected &&
+	git for-each-ref --format="%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/mytrees/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_atom refs/myblobs/first subject ""
 test_atom refs/myblobs/first contents:subject ""
 test_atom refs/myblobs/first body ""
@@ -727,6 +775,174 @@ test_atom refs/myblobs/first contents:body ""
 test_atom refs/myblobs/first contents:signature ""
 test_atom refs/myblobs/first contents ""
 
+test_expect_success 'basic atom: refs/myblobs/first raw' '
+	git cat-file blob refs/myblobs/first >expected &&
+	echo >>expected &&
+	git for-each-ref --format="%(raw)" refs/myblobs/first >actual &&
+	test_cmp expected actual &&
+	git cat-file -s refs/myblobs/first >expected &&
+	git for-each-ref --format="%(raw:size)" refs/myblobs/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'set up refs pointing to binary blob' '
+	printf "a\0b\0c" >blob1 &&
+	printf "a\0c\0b" >blob2 &&
+	printf "\0a\0b\0c" >blob3 &&
+	printf "abc" >blob4 &&
+	printf "\0 \0 \0 " >blob5 &&
+	printf "\0 \0a\0 " >blob6 &&
+	printf "  " >blob7 &&
+	>blob8 &&
+	obj=$(git hash-object -w blob1) &&
+	git update-ref refs/myblobs/blob1 "$obj" &&
+	obj=$(git hash-object -w blob2) &&
+	git update-ref refs/myblobs/blob2 "$obj" &&
+	obj=$(git hash-object -w blob3) &&
+	git update-ref refs/myblobs/blob3 "$obj" &&
+	obj=$(git hash-object -w blob4) &&
+	git update-ref refs/myblobs/blob4 "$obj" &&
+	obj=$(git hash-object -w blob5) &&
+	git update-ref refs/myblobs/blob5 "$obj" &&
+	obj=$(git hash-object -w blob6) &&
+	git update-ref refs/myblobs/blob6 "$obj" &&
+	obj=$(git hash-object -w blob7) &&
+	git update-ref refs/myblobs/blob7 "$obj" &&
+	obj=$(git hash-object -w blob8) &&
+	git update-ref refs/myblobs/blob8 "$obj"
+'
+
+test_expect_success 'Verify sorts with raw' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob3
+	refs/myblobs/blob7
+	refs/mytrees/first
+	refs/myblobs/first
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob4
+	refs/heads/main
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'Verify sorts with raw:size' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob8
+	refs/myblobs/first
+	refs/myblobs/blob7
+	refs/heads/main
+	refs/myblobs/blob4
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/mytrees/first
+	EOF
+	git for-each-ref --format="%(refname)" --sort=raw:size \
+		refs/heads/main refs/myblobs/ refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:equals)' '
+	cat >expected <<-EOF &&
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	refs/myblobs/blob4
+	not equals
+	not equals
+	not equals
+	not equals
+	not equals
+	EOF
+	git for-each-ref --format="%(if:equals=abc)%(raw)%(then)%(refname)%(else)not equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'validate raw atom with %(if:notequals)' '
+	cat >expected <<-EOF &&
+	refs/heads/ambiguous
+	refs/heads/main
+	refs/heads/newtag
+	refs/myblobs/blob1
+	refs/myblobs/blob2
+	refs/myblobs/blob3
+	equals
+	refs/myblobs/blob5
+	refs/myblobs/blob6
+	refs/myblobs/blob7
+	refs/myblobs/blob8
+	refs/myblobs/first
+	EOF
+	git for-each-ref --format="%(if:notequals=abc)%(raw)%(then)%(refname)%(else)equals%(end)" \
+		refs/myblobs/ refs/heads/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'empty raw refs with %(if)' '
+	cat >expected <<-EOF &&
+	refs/myblobs/blob1 not empty
+	refs/myblobs/blob2 not empty
+	refs/myblobs/blob3 not empty
+	refs/myblobs/blob4 not empty
+	refs/myblobs/blob5 not empty
+	refs/myblobs/blob6 not empty
+	refs/myblobs/blob7 empty
+	refs/myblobs/blob8 empty
+	refs/myblobs/first not empty
+	EOF
+	git for-each-ref --format="%(refname) %(if)%(raw)%(then)not empty%(else)empty%(end)" \
+		refs/myblobs/ >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success '%(raw) with --python must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --python
+'
+
+test_expect_success '%(raw) with --tcl must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --tcl
+'
+
+test_expect_success '%(raw) with --perl must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --perl
+'
+
+test_expect_success '%(raw) with --shell must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --shell
+'
+
+test_expect_success '%(raw) with --shell and --sort=raw must fail' '
+	test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
+'
+
+test_expect_success '%(raw:size) with --shell' '
+	git for-each-ref --format="%(raw:size)" | while read line
+	do
+		echo "'\''$line'\''" >>expect
+	done &&
+	git for-each-ref --format="%(raw:size)" --shell >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'for-each-ref --format compare with cat-file --batch' '
+	git rev-parse refs/mytrees/first | git cat-file --batch >expected &&
+	git for-each-ref --format="%(objectname) %(objecttype) %(objectsize)
+%(raw)" refs/mytrees/first >actual &&
+	test_cmp expected actual
+'
+
 test_expect_success 'set up multiple-sort tags' '
 	for when in 100000 200000
 	do
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v4 3/5] [GSOC] ref-filter: --format=%(raw) support --perl
  2021-07-26  3:26     ` [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
  2021-07-26  3:26       ` [PATCH v4 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
  2021-07-26  3:26       ` [PATCH v4 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
@ 2021-07-26  3:26       ` ZheNing Hu via GitGitGadget
  2021-07-26  3:26       ` [PATCH v4 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
  2021-07-26  3:26       ` [PATCH v4 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
  4 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-26  3:26 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Because the perl language can handle binary data correctly,
add the function perl_quote_buf_with_len(), which can specify
the length of the data and prevent the data from being truncated
at '\0' to help `--format="%(raw)"` support `--perl`.

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-for-each-ref.txt |  4 ++--
 quote.c                            | 17 +++++++++++++++++
 quote.h                            |  1 +
 ref-filter.c                       | 15 +++++++++++----
 t/t6300-for-each-ref.sh            | 19 +++++++++++++++++--
 5 files changed, 48 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index cbb6f87d13f..6da899c6296 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -241,8 +241,8 @@ raw:size::
 	The raw data size of the object.
 
 Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
-`--perl` because such language may not support arbitrary binary data in their
-string variable type.
+because such language may not support arbitrary binary data in their string
+variable type.
 
 The message in a commit or a tag object is `contents`, from which
 `contents:<part>` can be used to extract various parts out of:
diff --git a/quote.c b/quote.c
index 8a3a5e39eb1..26719d21d1e 100644
--- a/quote.c
+++ b/quote.c
@@ -471,6 +471,23 @@ void perl_quote_buf(struct strbuf *sb, const char *src)
 	strbuf_addch(sb, sq);
 }
 
+void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len)
+{
+	const char sq = '\'';
+	const char bq = '\\';
+	const char *c = src;
+	const char *end = src + len;
+
+	strbuf_addch(sb, sq);
+	while (c != end) {
+		if (*c == sq || *c == bq)
+			strbuf_addch(sb, bq);
+		strbuf_addch(sb, *c);
+		c++;
+	}
+	strbuf_addch(sb, sq);
+}
+
 void python_quote_buf(struct strbuf *sb, const char *src)
 {
 	const char sq = '\'';
diff --git a/quote.h b/quote.h
index 768cc6338e2..0fe69e264b0 100644
--- a/quote.h
+++ b/quote.h
@@ -94,6 +94,7 @@ char *quote_path(const char *in, const char *prefix, struct strbuf *out, unsigne
 
 /* quoting as a string literal for other languages */
 void perl_quote_buf(struct strbuf *sb, const char *src);
+void perl_quote_buf_with_len(struct strbuf *sb, const char *src, size_t len);
 void python_quote_buf(struct strbuf *sb, const char *src);
 void tcl_quote_buf(struct strbuf *sb, const char *src);
 void basic_regex_quote_buf(struct strbuf *sb, const char *src);
diff --git a/ref-filter.c b/ref-filter.c
index 0d5eb91ed54..ce6c2115016 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -746,7 +746,10 @@ static void quote_formatting(struct strbuf *s, const char *str, ssize_t len, int
 		sq_quote_buf(s, str);
 		break;
 	case QUOTE_PERL:
-		perl_quote_buf(s, str);
+		if (len < 0)
+			perl_quote_buf(s, str);
+		else
+			perl_quote_buf_with_len(s, str, len);
 		break;
 	case QUOTE_PYTHON:
 		python_quote_buf(s, str);
@@ -1009,10 +1012,14 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
-		if (format->quote_style && used_atom[at].atom_type == ATOM_RAW &&
-		    used_atom[at].u.raw_data.option == RAW_BARE)
+
+		if ((format->quote_style == QUOTE_PYTHON ||
+		     format->quote_style == QUOTE_SHELL ||
+		     format->quote_style == QUOTE_TCL) &&
+		     used_atom[at].atom_type == ATOM_RAW &&
+		     used_atom[at].u.raw_data.option == RAW_BARE)
 			die(_("--format=%.*s cannot be used with"
-			      "--python, --shell, --tcl, --perl"), (int)(ep - sp - 2), sp + 2);
+			      "--python, --shell, --tcl"), (int)(ep - sp - 2), sp + 2);
 		cp = ep + 1;
 
 		if (skip_prefix(used_atom[at].name, "color:", &color))
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 18554f62d94..3d15d0a5360 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -915,8 +915,23 @@ test_expect_success '%(raw) with --tcl must fail' '
 	test_must_fail git for-each-ref --format="%(raw)" --tcl
 '
 
-test_expect_success '%(raw) with --perl must fail' '
-	test_must_fail git for-each-ref --format="%(raw)" --perl
+test_expect_success '%(raw) with --perl' '
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob1 --perl | perl >actual &&
+	cmp blob1 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob3 --perl | perl >actual &&
+	cmp blob3 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/blob8 --perl | perl >actual &&
+	cmp blob8 actual &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/myblobs/first --perl | perl >actual &&
+	cmp one actual &&
+	git cat-file tree refs/mytrees/first > expected &&
+	git for-each-ref --format="\$name= %(raw);
+print \"\$name\"" refs/mytrees/first --perl | perl >actual &&
+	cmp expected actual
 '
 
 test_expect_success '%(raw) with --shell must fail' '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v4 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser()
  2021-07-26  3:26     ` [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                         ` (2 preceding siblings ...)
  2021-07-26  3:26       ` [PATCH v4 3/5] [GSOC] ref-filter: --format=%(raw) support --perl ZheNing Hu via GitGitGadget
@ 2021-07-26  3:26       ` ZheNing Hu via GitGitGadget
  2021-07-26  3:26       ` [PATCH v4 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
  4 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-26  3:26 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Use non-const ref_format in *_atom_parser(), which can help us
modify the members of ref_format in *_atom_parser().

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/tag.c |  2 +-
 ref-filter.c  | 44 ++++++++++++++++++++++----------------------
 ref-filter.h  |  4 ++--
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 82fcfc09824..452558ec957 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -146,7 +146,7 @@ static int verify_tag(const char *name, const char *ref,
 		      const struct object_id *oid, void *cb_data)
 {
 	int flags;
-	const struct ref_format *format = cb_data;
+	struct ref_format *format = cb_data;
 	flags = GPG_VERIFY_VERBOSE;
 
 	if (format->format)
diff --git a/ref-filter.c b/ref-filter.c
index ce6c2115016..f944160c5ce 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -226,7 +226,7 @@ static int strbuf_addf_ret(struct strbuf *sb, int ret, const char *fmt, ...)
 	return ret;
 }
 
-static int color_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int color_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *color_value, struct strbuf *err)
 {
 	if (!color_value)
@@ -264,7 +264,7 @@ static int refname_atom_parser_internal(struct refname_atom *atom, const char *a
 	return 0;
 }
 
-static int remote_ref_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int remote_ref_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	struct string_list params = STRING_LIST_INIT_DUP;
@@ -311,7 +311,7 @@ static int remote_ref_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objecttype_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objecttype_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -323,7 +323,7 @@ static int objecttype_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int objectsize_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int objectsize_atom_parser(struct ref_format *format, struct used_atom *atom,
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -343,7 +343,7 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 	return 0;
 }
 
-static int deltabase_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int deltabase_atom_parser(struct ref_format *format, struct used_atom *atom,
 				 const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -355,7 +355,7 @@ static int deltabase_atom_parser(const struct ref_format *format, struct used_at
 	return 0;
 }
 
-static int body_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int body_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *err)
 {
 	if (arg)
@@ -364,7 +364,7 @@ static int body_atom_parser(const struct ref_format *format, struct used_atom *a
 	return 0;
 }
 
-static int subject_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int subject_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -376,7 +376,7 @@ static int subject_atom_parser(const struct ref_format *format, struct used_atom
 	return 0;
 }
 
-static int trailers_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int trailers_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	atom->u.contents.trailer_opts.no_divider = 1;
@@ -402,7 +402,7 @@ static int trailers_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int contents_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int contents_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -430,7 +430,7 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 	return 0;
 }
 
-static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int raw_atom_parser(struct ref_format *format, struct used_atom *atom,
 				const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -442,7 +442,7 @@ static int raw_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int oid_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int oid_atom_parser(struct ref_format *format, struct used_atom *atom,
 			   const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -461,7 +461,7 @@ static int oid_atom_parser(const struct ref_format *format, struct used_atom *at
 	return 0;
 }
 
-static int person_email_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int person_email_atom_parser(struct ref_format *format, struct used_atom *atom,
 				    const char *arg, struct strbuf *err)
 {
 	if (!arg)
@@ -475,7 +475,7 @@ static int person_email_atom_parser(const struct ref_format *format, struct used
 	return 0;
 }
 
-static int refname_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int refname_atom_parser(struct ref_format *format, struct used_atom *atom,
 			       const char *arg, struct strbuf *err)
 {
 	return refname_atom_parser_internal(&atom->u.refname, arg, atom->name, err);
@@ -492,7 +492,7 @@ static align_type parse_align_position(const char *s)
 	return -1;
 }
 
-static int align_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int align_atom_parser(struct ref_format *format, struct used_atom *atom,
 			     const char *arg, struct strbuf *err)
 {
 	struct align *align = &atom->u.align;
@@ -544,7 +544,7 @@ static int align_atom_parser(const struct ref_format *format, struct used_atom *
 	return 0;
 }
 
-static int if_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 			  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
@@ -559,7 +559,7 @@ static int if_atom_parser(const struct ref_format *format, struct used_atom *ato
 	return 0;
 }
 
-static int head_atom_parser(const struct ref_format *format, struct used_atom *atom,
+static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
 	atom->u.head = resolve_refdup("HEAD", RESOLVE_REF_READING, NULL, NULL);
@@ -570,7 +570,7 @@ static struct {
 	const char *name;
 	info_source source;
 	cmp_type cmp_type;
-	int (*parser)(const struct ref_format *format, struct used_atom *atom,
+	int (*parser)(struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
 	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
@@ -653,7 +653,7 @@ struct atom_value {
 /*
  * Used to parse format string and sort specifiers
  */
-static int parse_ref_filter_atom(const struct ref_format *format,
+static int parse_ref_filter_atom(struct ref_format *format,
 				 const char *atom, const char *ep,
 				 struct strbuf *err)
 {
@@ -2556,9 +2556,9 @@ static void append_literal(const char *cp, const char *ep, struct ref_formatting
 }
 
 int format_ref_array_item(struct ref_array_item *info,
-			   const struct ref_format *format,
-			   struct strbuf *final_buf,
-			   struct strbuf *error_buf)
+			  struct ref_format *format,
+			  struct strbuf *final_buf,
+			  struct strbuf *error_buf)
 {
 	const char *cp, *sp, *ep;
 	struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
@@ -2602,7 +2602,7 @@ int format_ref_array_item(struct ref_array_item *info,
 }
 
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format)
+		      struct ref_format *format)
 {
 	struct ref_array_item *ref_item;
 	struct strbuf output = STRBUF_INIT;
diff --git a/ref-filter.h b/ref-filter.h
index baf72a71896..74fb423fc89 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -116,7 +116,7 @@ void ref_array_sort(struct ref_sorting *sort, struct ref_array *array);
 void ref_sorting_set_sort_flags_all(struct ref_sorting *sorting, unsigned int mask, int on);
 /*  Based on the given format and quote_style, fill the strbuf */
 int format_ref_array_item(struct ref_array_item *info,
-			  const struct ref_format *format,
+			  struct ref_format *format,
 			  struct strbuf *final_buf,
 			  struct strbuf *error_buf);
 /*  Parse a single sort specifier and add it to the list */
@@ -137,7 +137,7 @@ void setup_ref_filter_porcelain_msg(void);
  * name must be a fully qualified refname.
  */
 void pretty_print_ref(const char *name, const struct object_id *oid,
-		      const struct ref_format *format);
+		      struct ref_format *format);
 
 /*
  * Push a single ref onto the array; this can be used to construct your own
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v4 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-26  3:26     ` [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
                         ` (3 preceding siblings ...)
  2021-07-26  3:26       ` [PATCH v4 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
@ 2021-07-26  3:26       ` ZheNing Hu via GitGitGadget
  4 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-07-26  3:26 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Christian Couder, Hariom Verma, Bagas Sanjaya,
	Jeff King, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

%(rest) is a atom used for cat-file batch mode, which can split
the input lines at the first whitespace boundary, all characters
before that whitespace are considered to be the object name;
characters after that first run of whitespace (i.e., the "rest"
of the line) are output in place of the %(rest) atom.

In order to let "cat-file --batch=%(rest)" use the ref-filter
interface, add %(rest) atom for ref-filter.

Introduce the reject_atom() to reject the atom %(rest) for
"git for-each-ref", "git branch", "git tag" and "git verify-tag".

Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
Suggected-by: Jacob Keller <jacob.keller@gmail.com>
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c             | 25 +++++++++++++++++++++++++
 ref-filter.h             |  5 ++++-
 t/t3203-branch-output.sh |  4 ++++
 t/t6300-for-each-ref.sh  |  4 ++++
 t/t7004-tag.sh           |  4 ++++
 t/t7030-verify-tag.sh    |  4 ++++
 6 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index f944160c5ce..2cacee9e64a 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -157,6 +157,7 @@ enum atom_type {
 	ATOM_IF,
 	ATOM_THEN,
 	ATOM_ELSE,
+	ATOM_REST,
 };
 
 /*
@@ -559,6 +560,15 @@ static int if_atom_parser(struct ref_format *format, struct used_atom *atom,
 	return 0;
 }
 
+static int rest_atom_parser(struct ref_format *format, struct used_atom *atom,
+			    const char *arg, struct strbuf *err)
+{
+	if (arg)
+		return strbuf_addf_ret(err, -1, _("%%(rest) does not take arguments"));
+	format->use_rest = 1;
+	return 0;
+}
+
 static int head_atom_parser(struct ref_format *format, struct used_atom *atom,
 			    const char *arg, struct strbuf *unused_err)
 {
@@ -615,6 +625,7 @@ static struct {
 	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
 	[ATOM_THEN] = { "then", SOURCE_NONE },
 	[ATOM_ELSE] = { "else", SOURCE_NONE },
+	[ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -992,6 +1003,11 @@ static const char *find_next(const char *cp)
 	return NULL;
 }
 
+static int reject_atom(enum atom_type atom_type)
+{
+	return atom_type == ATOM_REST;
+}
+
 /*
  * Make sure the format string is well formed, and parse out
  * the used atoms.
@@ -1012,6 +1028,8 @@ int verify_ref_format(struct ref_format *format)
 		at = parse_ref_filter_atom(format, sp + 2, ep, &err);
 		if (at < 0)
 			die("%s", err.buf);
+		if (reject_atom(used_atom[at].atom_type))
+			die(_("this command reject atom %%(%.*s)"), (int)(ep - sp - 2), sp + 2);
 
 		if ((format->quote_style == QUOTE_PYTHON ||
 		     format->quote_style == QUOTE_SHELL ||
@@ -1931,6 +1949,12 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
+		} else if (atom_type == ATOM_REST) {
+			if (ref->rest)
+				v->s = xstrdup(ref->rest);
+			else
+				v->s = xstrdup("");
+			continue;
 		} else
 			continue;
 
@@ -2148,6 +2172,7 @@ static struct ref_array_item *new_ref_array_item(const char *refname,
 
 	FLEX_ALLOC_STR(ref, refname, refname);
 	oidcpy(&ref->objectname, oid);
+	ref->rest = NULL;
 
 	return ref;
 }
diff --git a/ref-filter.h b/ref-filter.h
index 74fb423fc89..c15dee8d6b9 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -38,6 +38,7 @@ struct ref_sorting {
 
 struct ref_array_item {
 	struct object_id objectname;
+	const char *rest;
 	int flag;
 	unsigned int kind;
 	const char *symref;
@@ -76,14 +77,16 @@ struct ref_format {
 	 * verify_ref_format() afterwards to finalize.
 	 */
 	const char *format;
+	const char *rest;
 	int quote_style;
+	int use_rest;
 	int use_color;
 
 	/* Internal state to ref-filter */
 	int need_color_reset_at_eol;
 };
 
-#define REF_FORMAT_INIT { NULL, 0, -1 }
+#define REF_FORMAT_INIT { .use_color = -1 }
 
 /*  Macros for checking --merged and --no-merged options */
 #define _OPT_MERGED_NO_MERGED(option, filter, h) \
diff --git a/t/t3203-branch-output.sh b/t/t3203-branch-output.sh
index 5325b9f67a0..6e94c6db7b5 100755
--- a/t/t3203-branch-output.sh
+++ b/t/t3203-branch-output.sh
@@ -340,6 +340,10 @@ test_expect_success 'git branch --format option' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git branch with --format=%(rest) must fail' '
+	test_must_fail git branch --format="%(rest)" >actual
+'
+
 test_expect_success 'worktree colors correct' '
 	cat >expect <<-EOF &&
 	* <GREEN>(HEAD detached from fromtag)<RESET>
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index 3d15d0a5360..0d2e062f791 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -1211,6 +1211,10 @@ test_expect_success 'basic atom: head contents:trailers' '
 	test_cmp expect actual.clean
 '
 
+test_expect_success 'basic atom: rest must fail' '
+	test_must_fail git for-each-ref --format="%(rest)" refs/heads/main
+'
+
 test_expect_success 'trailer parsing not fooled by --- line' '
 	git commit --allow-empty -F - <<-\EOF &&
 	this is the subject
diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index 2f72c5c6883..082be85dffc 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1998,6 +1998,10 @@ test_expect_success '--format should list tags as per format given' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git tag -l with --format="%(rest)" must fail' '
+	test_must_fail git tag -l --format="%(rest)" "v1*"
+'
+
 test_expect_success "set up color tests" '
 	echo "<RED>v1.0<RESET>" >expect.color &&
 	echo "v1.0" >expect.bare &&
diff --git a/t/t7030-verify-tag.sh b/t/t7030-verify-tag.sh
index 3cefde9602b..10faa645157 100755
--- a/t/t7030-verify-tag.sh
+++ b/t/t7030-verify-tag.sh
@@ -194,6 +194,10 @@ test_expect_success GPG 'verifying tag with --format' '
 	test_cmp expect actual
 '
 
+test_expect_success GPG 'verifying tag with --format="%(rest)" must fail' '
+	test_must_fail git verify-tag --format="%(rest)" "fourth-signed"
+'
+
 test_expect_success GPG 'verifying a forged tag with --format should fail silently' '
 	test_must_fail git verify-tag --format="tagname : %(tag)" $(cat forged1.tag) >actual-forged &&
 	test_must_be_empty actual-forged
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 5/5] [GSOC] ref-filter: add %(rest) atom
  2021-07-25  8:29       ` Jacob Keller
@ 2021-07-26 17:34         ` Junio C Hamano
  0 siblings, 0 replies; 45+ messages in thread
From: Junio C Hamano @ 2021-07-26 17:34 UTC (permalink / raw)
  To: Jacob Keller
  Cc: ZheNing Hu via GitGitGadget, Git mailing list, Christian Couder,
	Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, ZheNing Hu

Jacob Keller <jacob.keller@gmail.com> writes:

> On Sat, Jul 24, 2021 at 7:14 AM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: ZheNing Hu <adlternative@gmail.com>
>>
>> %(rest) is a atom used for cat-file batch mode, which can split
>> the input lines at the first whitespace boundary, all characters
>> before that whitespace are considered to be the object name;
>> characters after that first run of whitespace (i.e., the "rest"
>> of the line) are output in place of the %(rest) atom.
>>
>> In order to let "cat-file --batch=%(rest)" use the ref-filter
>> interface, add %(rest) atom for ref-filter.
>>
>> Introduce the reject_atom() to reject the atom %(rest) for
>> "git for-each-ref", "git branch", "git tag" and "git verify-tag".
>>
>> Suggected-by: Jacob Keller <jacob.keller@gmail.com>
>> Mentored-by: Christian Couder <christian.couder@gmail.com>
>> Mentored-by: Hariom Verma <hariom18599@gmail.com>
>> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
>> ---
>
> Thanks for the improved  commit message here. That helps clarify what
> this atom does and aids in understanding the implementation.
>
> Reviewed-by: Jacob Keller <jacob.keller@gmail.com>

Thanks, both.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v4 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-26  3:26       ` [PATCH v4 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
@ 2021-07-26 19:15         ` Junio C Hamano
  2021-07-27  1:41           ` ZheNing Hu
  0 siblings, 1 reply; 45+ messages in thread
From: Junio C Hamano @ 2021-07-26 19:15 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Christian Couder, Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> Only tag and commit objects use `grab_sub_body_contents()` to grab
> object contents in the current codebase.  We want to teach the
> function to also handle blobs and trees to get their raw data,
> without parsing a blob (whose contents looks like a commit or a tag)
> incorrectly as a commit or a tag. So it's needed to pass a
> `struct expand_data *data` instread of only `void *buf` to both
> `grab_sub_body_contents()` and `grab_values()` to be able to check
> the object type.
>
> Skip the block of code that is specific to handling commits and tags
> early when the given object is of a wrong type to help later
> addition to handle other types of objects in this function.
>
> Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Hariom Verma <hariom18599@gmail.com>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>

It is not a huge deal, but are these trailers given in a sensible
order?  The order of event happened were that help and mentoring
were given, your finished version of the patch came after that to be
signed off by you, and then a reviewer gave you a reviewed-by, no?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v4 1/5] [GSOC] ref-filter: add obj-type check in grab contents
  2021-07-26 19:15         ` Junio C Hamano
@ 2021-07-27  1:41           ` ZheNing Hu
  0 siblings, 0 replies; 45+ messages in thread
From: ZheNing Hu @ 2021-07-27  1:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Christian Couder,
	Hariom Verma, Bagas Sanjaya, Jeff King,
	Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Philip Oakley, Jacob Keller

Junio C Hamano <gitster@pobox.com> 于2021年7月27日周二 上午3:15写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Only tag and commit objects use `grab_sub_body_contents()` to grab
> > object contents in the current codebase.  We want to teach the
> > function to also handle blobs and trees to get their raw data,
> > without parsing a blob (whose contents looks like a commit or a tag)
> > incorrectly as a commit or a tag. So it's needed to pass a
> > `struct expand_data *data` instread of only `void *buf` to both
> > `grab_sub_body_contents()` and `grab_values()` to be able to check
> > the object type.
> >
> > Skip the block of code that is specific to handling commits and tags
> > early when the given object is of a wrong type to help later
> > addition to handle other types of objects in this function.
> >
> > Reviewed-by: Jacob Keller <jacob.keller@gmail.com>
> > Mentored-by: Christian Couder <christian.couder@gmail.com>
> > Mentored-by: Hariom Verma <hariom18599@gmail.com>
> > Helped-by: Junio C Hamano <gitster@pobox.com>
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
>
> It is not a huge deal, but are these trailers given in a sensible
> order?  The order of event happened were that help and mentoring
> were given, your finished version of the patch came after that to be
> signed off by you, and then a reviewer gave you a reviewed-by, no?

Indeed so.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2021-07-27  1:40 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-22  7:39 [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
2021-07-22  7:39 ` [PATCH 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
2021-07-22  8:51   ` Christian Couder
2021-07-22  9:15     ` ZheNing Hu
2021-07-22  7:39 ` [PATCH 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
2021-07-22  8:06   ` Ævar Arnfjörð Bjarmason
2021-07-22  8:31     ` ZheNing Hu
2021-07-22  7:39 ` [PATCH 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
2021-07-22  7:39 ` [PATCH 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
2021-07-22  7:39 ` [PATCH 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
2021-07-22  8:11   ` Ævar Arnfjörð Bjarmason
2021-07-22  9:10     ` ZheNing Hu
2021-07-23  9:09   ` Jacob Keller
2021-07-23 14:11     ` ZheNing Hu
2021-07-22  8:08 ` [PATCH 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms Christian Couder
2021-07-22  8:22   ` ZheNing Hu
2021-07-23  9:03 ` [PATCH v2 " ZheNing Hu via GitGitGadget
2021-07-23  9:03   ` [PATCH v2 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
2021-07-23  9:04   ` [PATCH v2 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
2021-07-23 16:38     ` Junio C Hamano
2021-07-24  7:57       ` ZheNing Hu
2021-07-24 17:41         ` Junio C Hamano
2021-07-25 12:47           ` ZheNing Hu
2021-07-23  9:04   ` [PATCH v2 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
2021-07-23  9:04   ` [PATCH v2 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
2021-07-23  9:04   ` [PATCH v2 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
2021-07-24 14:14   ` [PATCH v3 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
2021-07-24 14:14     ` [PATCH v3 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
2021-07-25  8:22       ` Jacob Keller
2021-07-24 14:14     ` [PATCH v3 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
2021-07-24 14:14     ` [PATCH v3 3/5] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
2021-07-25  8:27       ` Jacob Keller
2021-07-25 13:18         ` ZheNing Hu
2021-07-24 14:14     ` [PATCH v3 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
2021-07-24 14:14     ` [PATCH v3 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
2021-07-25  8:29       ` Jacob Keller
2021-07-26 17:34         ` Junio C Hamano
2021-07-26  3:26     ` [PATCH v4 0/5] [GSOC] ref-filter: add %(raw) and %(rest) atoms ZheNing Hu via GitGitGadget
2021-07-26  3:26       ` [PATCH v4 1/5] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
2021-07-26 19:15         ` Junio C Hamano
2021-07-27  1:41           ` ZheNing Hu
2021-07-26  3:26       ` [PATCH v4 2/5] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
2021-07-26  3:26       ` [PATCH v4 3/5] [GSOC] ref-filter: --format=%(raw) support --perl ZheNing Hu via GitGitGadget
2021-07-26  3:26       ` [PATCH v4 4/5] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
2021-07-26  3:26       ` [PATCH v4 5/5] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.