git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] [GSOC][RFC] ref-filter: introduce enum atom_type
@ 2021-05-08 15:22 ZheNing Hu via GitGitGadget
  2021-05-08 15:22 ` [PATCH 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-08 15:22 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Christian Couder, Hariom Verma, ZheNing Hu

Firstly, let the union used_atom.u add a member "objectsize",which learn to
record the attributes of "objectsize", this will bring an extension line for
atom "%(objectsize)". Next patch will base on this feature to better support
its functions.

Secondly, In
https://lore.kernel.org/git/CAOLTT8RhVZQJX8z1gY5UM1jv0imZ4K9UnD14MgJFfvqBBiAZQg@mail.gmail.com/
I and Junio discussed the use of enum atom_type in used_atom to represent
the specific type of union used_atom.u, which can correctly distinguish atom
types and reduce the overhead of string matching.

But after my attempts, I found that our used_atom is better using enum to
remember the type of valid_atom item instead of remember type of
used_atom.u, which can be used throughout the ref-filter, easily distinguish
different types of atoms through a used_atom[i].atom_type, and reduce the
overhead of string matching.

What's left: some function like grab_oid() or grab_person(), they can also
take advantage of this feature, but require some additional modifications.

What's for future: Let used_atom entry can index directly by enum atom_type,
In order to reduce the very many loops in ref-filter.

E.g.

static struct used_atom_cache { enum atom_type atom_type; struct used_atom
*used_atom; } used_atom_cache[] = { {ATOM_REFNAME, NULL}, {ATOM_OBJECTTYPE,
NULL}, ... };

If want check whether used_atom entries have a atom "%(refname)", we can
just check if
used_atom_cache[ATOM_REFNAME].used_atom is equal to NULL, the time
complexity is O(1) and the complexity of the original method is O(n). This
will once again get a performance boost for ref-filter.

ZheNing Hu (2):
  [GSOC] ref-filter: add objectsize to used_atom
  [GSOC][RFC] ref-filter: introduce enum atom_type

 ref-filter.c | 209 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 135 insertions(+), 74 deletions(-)


base-commit: 7e391989789db82983665667013a46eabc6fc570
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-951%2Fadlternative%2Fref-filter-atom-type-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-951/adlternative/ref-filter-atom-type-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/951
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/2] [GSOC] ref-filter: add objectsize to used_atom
  2021-05-08 15:22 [PATCH 0/2] [GSOC][RFC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
@ 2021-05-08 15:22 ` ZheNing Hu via GitGitGadget
  2021-05-08 15:22 ` [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
  2021-05-10 15:03 ` [PATCH v2 0/2] " ZheNing Hu via GitGitGadget
  2 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-08 15:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Christian Couder, Hariom Verma,
	ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Since "objectsize:size" is composed of two parts,
"type:attribute". However, the original implementation
did not decouple the two parts "type" and "attribute" well,
we still need to judge separately whether the atom is
"objectsize" or "objectsize:disk" in `grab_common_values()`.

Add a new member `objectsize` to the union `used_atom.u`,
so that we can separate the judgment of atom type from the
judgment of atom attribute, This will bring scalability to
atom `%(objectsize)`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index a0adb4551d87..f420bae6e5ba 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -146,6 +146,9 @@ static struct used_atom {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
 		} oid;
+		struct {
+			enum { O_SIZE, O_SIZE_DISK } option;
+		} objectsize;
 		struct email_option {
 			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
 		} email_option;
@@ -269,11 +272,13 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
+		atom->u.objectsize.option = O_SIZE;
 		if (*atom->name == '*')
 			oi_deref.info.sizep = &oi_deref.size;
 		else
 			oi.info.sizep = &oi.size;
 	} else if (!strcmp(arg, "disk")) {
+		atom->u.objectsize.option = O_SIZE_DISK;
 		if (*atom->name == '*')
 			oi_deref.info.disk_sizep = &oi_deref.disk_size;
 		else
@@ -967,12 +972,14 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 			name++;
 		if (!strcmp(name, "objecttype"))
 			v->s = xstrdup(type_name(oi->type));
-		else if (!strcmp(name, "objectsize:disk")) {
-			v->value = oi->disk_size;
-			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
-		} else if (!strcmp(name, "objectsize")) {
-			v->value = oi->size;
-			v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
+		else if (starts_with(name, "objectsize")) {
+			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
+				v->value = oi->disk_size;
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
+			} else if (used_atom[i].u.objectsize.option == O_SIZE) {
+				v->value = oi->size;
+				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
+			}
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type
  2021-05-08 15:22 [PATCH 0/2] [GSOC][RFC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
  2021-05-08 15:22 ` [PATCH 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
@ 2021-05-08 15:22 ` ZheNing Hu via GitGitGadget
  2021-05-09  6:21   ` Christian Couder
  2021-05-10 15:03 ` [PATCH v2 0/2] " ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-08 15:22 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Christian Couder, Hariom Verma,
	ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

In the original ref-filter design, it will copy the parsed
atom's name and attributes to `used_atom[i].name` in the
atom's parsing step, and use it again for string matching
in the later specific ref attributes filling step. It use
a lot of string matching to determine which atom we need.

Introduce the enum member `valid_atom.atom_type` which
record type of each valid_atom, in the first step of the
atom parsing, `used_atom.atom_type` will record corresponding
enum value from `valid_atom.atom_type`, and then in specific
reference attribute filling step, only need to compare the
value of the `used_atom.atom_type` to judge the atom type.

At the same time, The value of an atom_type is the coordinate
of its corresponding valid_atom entry, we can quickly index
to the corresponding valid_atom entry by the atom_type value.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 192 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 123 insertions(+), 69 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f420bae6e5ba..4ce46e70dc99 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -108,6 +108,50 @@ static struct ref_to_worktree_map {
 	struct worktree **worktrees;
 } ref_to_worktree_map;
 
+enum atom_type {
+ATOM_REFNAME,
+ATOM_OBJECTTYPE,
+ATOM_OBJECTSIZE,
+ATOM_OBJECTNAME,
+ATOM_DELTABASE,
+ATOM_TREE,
+ATOM_PARENT,
+ATOM_NUMPARENT,
+ATOM_OBJECT,
+ATOM_TYPE,
+ATOM_TAG,
+ATOM_AUTHOR,
+ATOM_AUTHORNAME,
+ATOM_AUTHOREMAIL,
+ATOM_AUTHORDATE,
+ATOM_COMMITTER,
+ATOM_COMMITTERNAME,
+ATOM_COMMITTEREMAIL,
+ATOM_COMMITTERDATE,
+ATOM_TAGGER,
+ATOM_TAGGERNAME,
+ATOM_TAGGEREMAIL,
+ATOM_TAGGERDATE,
+ATOM_CREATOR,
+ATOM_CREATORDATE,
+ATOM_SUBJECT,
+ATOM_BODY,
+ATOM_TRAILERS,
+ATOM_CONTENTS,
+ATOM_UPSTREAM,
+ATOM_PUSH,
+ATOM_SYMREF,
+ATOM_FLAG,
+ATOM_HEAD,
+ATOM_COLOR,
+ATOM_WORKTREEPATH,
+ATOM_ALIGN,
+ATOM_END,
+ATOM_IF,
+ATOM_THEN,
+ATOM_ELSE,
+};
+
 /*
  * An atom is a valid field atom listed below, possibly prefixed with
  * a "*" to denote deref_tag().
@@ -122,6 +166,7 @@ static struct used_atom {
 	const char *name;
 	cmp_type type;
 	info_source source;
+	enum atom_type atom_type;
 	union {
 		char color[COLOR_MAXLEN];
 		struct align align;
@@ -500,53 +545,54 @@ static int head_atom_parser(const struct ref_format *format, struct used_atom *a
 }
 
 static struct {
+	enum atom_type atom_type;
 	const char *name;
 	info_source source;
 	cmp_type cmp_type;
 	int (*parser)(const struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
-	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
-	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
-	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
-	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
-	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
-	{ "object", SOURCE_OBJ },
-	{ "type", SOURCE_OBJ },
-	{ "tag", SOURCE_OBJ },
-	{ "author", SOURCE_OBJ },
-	{ "authorname", SOURCE_OBJ },
-	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "authordate", SOURCE_OBJ, FIELD_TIME },
-	{ "committer", SOURCE_OBJ },
-	{ "committername", SOURCE_OBJ },
-	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "tagger", SOURCE_OBJ },
-	{ "taggername", SOURCE_OBJ },
-	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "creator", SOURCE_OBJ },
-	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
-	{ "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
-	{ "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
-	{ "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
-	{ "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
-	{ "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "flag", SOURCE_NONE },
-	{ "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
-	{ "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
-	{ "worktreepath", SOURCE_NONE },
-	{ "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
-	{ "end", SOURCE_NONE },
-	{ "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
-	{ "then", SOURCE_NONE },
-	{ "else", SOURCE_NONE },
+	{ ATOM_REFNAME, "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	{ ATOM_OBJECTTYPE, "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
+	{ ATOM_OBJECTSIZE, "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
+	{ ATOM_OBJECTNAME, "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
+	{ ATOM_DELTABASE, "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
+	{ ATOM_TREE, "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	{ ATOM_PARENT, "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	{ ATOM_NUMPARENT, "numparent", SOURCE_OBJ, FIELD_ULONG },
+	{ ATOM_OBJECT, "object", SOURCE_OBJ },
+	{ ATOM_TYPE, "type", SOURCE_OBJ },
+	{ ATOM_TAG, "tag", SOURCE_OBJ },
+	{ ATOM_AUTHOR, "author", SOURCE_OBJ },
+	{ ATOM_AUTHORNAME, "authorname", SOURCE_OBJ },
+	{ ATOM_AUTHOREMAIL, "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	{ ATOM_AUTHORDATE, "authordate", SOURCE_OBJ, FIELD_TIME },
+	{ ATOM_COMMITTER, "committer", SOURCE_OBJ },
+	{ ATOM_COMMITTERNAME, "committername", SOURCE_OBJ },
+	{ ATOM_COMMITTEREMAIL, "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	{ ATOM_COMMITTERDATE, "committerdate", SOURCE_OBJ, FIELD_TIME },
+	{ ATOM_TAGGER, "tagger", SOURCE_OBJ },
+	{ ATOM_TAGGERNAME, "taggername", SOURCE_OBJ },
+	{ ATOM_TAGGEREMAIL, "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	{ ATOM_TAGGERDATE, "taggerdate", SOURCE_OBJ, FIELD_TIME },
+	{ ATOM_CREATOR, "creator", SOURCE_OBJ },
+	{ ATOM_CREATORDATE, "creatordate", SOURCE_OBJ, FIELD_TIME },
+	{ ATOM_SUBJECT, "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
+	{ ATOM_BODY, "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
+	{ ATOM_TRAILERS, "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
+	{ ATOM_CONTENTS, "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	{ ATOM_UPSTREAM, "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	{ ATOM_PUSH, "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	{ ATOM_SYMREF, "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	{ ATOM_FLAG, "flag", SOURCE_NONE },
+	{ ATOM_HEAD, "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
+	{ ATOM_COLOR, "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
+	{ ATOM_WORKTREEPATH, "worktreepath", SOURCE_NONE },
+	{ ATOM_ALIGN, "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
+	{ ATOM_END, "end", SOURCE_NONE },
+	{ ATOM_IF, "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
+	{ ATOM_THEN, "then", SOURCE_NONE },
+	{ ATOM_ELSE, "else", SOURCE_NONE },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -628,6 +674,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	at = used_atom_cnt;
 	used_atom_cnt++;
 	REALLOC_ARRAY(used_atom, used_atom_cnt);
+	used_atom[at].atom_type = valid_atom[i].atom_type;
 	used_atom[at].name = xmemdupz(atom, ep - atom);
 	used_atom[at].type = valid_atom[i].cmp_type;
 	used_atom[at].source = valid_atom[i].source;
@@ -652,7 +699,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return -1;
 	if (*atom == '*')
 		need_tagged = 1;
-	if (!strcmp(valid_atom[i].name, "symref"))
+	if (valid_atom[i].atom_type == ATOM_SYMREF)
 		need_symref = 1;
 	return at;
 }
@@ -965,14 +1012,15 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "objecttype"))
+		if (atom_type == ATOM_OBJECTTYPE)
 			v->s = xstrdup(type_name(oi->type));
-		else if (starts_with(name, "objectsize")) {
+		else if (atom_type == ATOM_OBJECTSIZE) {
 			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
 				v->value = oi->disk_size;
 				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
@@ -980,9 +1028,9 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 				v->value = oi->size;
 				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
 			}
-		} else if (!strcmp(name, "deltabase"))
+		} else if (atom_type == ATOM_DELTABASE)
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
-		else if (deref)
+		else if (atom_type == ATOM_OBJECTNAME && deref)
 			grab_oid(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
@@ -995,16 +1043,17 @@ static void grab_tag_values(struct atom_value *val, int deref, struct object *ob
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tag"))
+		if (atom_type == ATOM_TAG)
 			v->s = xstrdup(tag->tag);
-		else if (!strcmp(name, "type") && tag->tagged)
+		else if (atom_type == ATOM_TYPE && tag->tagged)
 			v->s = xstrdup(type_name(tag->tagged->type));
-		else if (!strcmp(name, "object") && tag->tagged)
+		else if (atom_type == ATOM_OBJECT && tag->tagged)
 			v->s = xstrdup(oid_to_hex(&tag->tagged->oid));
 	}
 }
@@ -1017,18 +1066,20 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
+		if (atom_type == ATOM_TREE &&
+		    grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
 			continue;
-		if (!strcmp(name, "numparent")) {
+		if (atom_type == ATOM_NUMPARENT) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (starts_with(name, "parent")) {
+		else if (atom_type == ATOM_PARENT) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
@@ -1208,15 +1259,16 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 		return;
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
-		if (starts_with(name, "creatordate"))
+		if (atom_type == ATOM_CREATORDATE)
 			grab_date(wholine, v, name);
-		else if (!strcmp(name, "creator"))
+		else if (atom_type == ATOM_CREATOR)
 			v->s = copy_line(wholine);
 	}
 }
@@ -1696,6 +1748,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	/* Fill in specials first */
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
+		enum atom_type atom_type = atom->atom_type;
 		const char *name = used_atom[i].name;
 		struct atom_value *v = &ref->value[i];
 		int deref = 0;
@@ -1710,18 +1763,18 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			name++;
 		}
 
-		if (starts_with(name, "refname"))
+		if (atom_type == ATOM_REFNAME)
 			refname = get_refname(atom, ref);
-		else if (!strcmp(name, "worktreepath")) {
+		else if (atom_type == ATOM_WORKTREEPATH) {
 			if (ref->kind == FILTER_REFS_BRANCHES)
 				v->s = get_worktree_path(atom, ref);
 			else
 				v->s = xstrdup("");
 			continue;
 		}
-		else if (starts_with(name, "symref"))
+		else if (atom_type == ATOM_SYMREF)
 			refname = get_symref(atom, ref);
-		else if (starts_with(name, "upstream")) {
+		else if (atom_type == ATOM_UPSTREAM) {
 			const char *branch_name;
 			/* only local branches may have an upstream */
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1737,7 +1790,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			else
 				v->s = xstrdup("");
 			continue;
-		} else if (atom->u.remote_ref.push) {
+		} else if (atom_type == ATOM_PUSH && atom->u.remote_ref.push) {
 			const char *branch_name;
 			v->s = xstrdup("");
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1756,10 +1809,10 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			free((char *)v->s);
 			fill_remote_ref_details(atom, refname, branch, &v->s);
 			continue;
-		} else if (starts_with(name, "color:")) {
+		} else if (atom_type == ATOM_COLOR) {
 			v->s = xstrdup(atom->u.color);
 			continue;
-		} else if (!strcmp(name, "flag")) {
+		} else if (atom_type == ATOM_FLAG) {
 			char buf[256], *cp = buf;
 			if (ref->flag & REF_ISSYMREF)
 				cp = copy_advance(cp, ",symref");
@@ -1772,23 +1825,24 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_oid(name, "objectname", &ref->objectname, v, atom)) {
-			continue;
-		} else if (!strcmp(name, "HEAD")) {
+		} else if (!deref && atom_type == ATOM_OBJECTNAME &&
+			   grab_oid(name, "objectname", &ref->objectname, v, atom)) {
+				continue;
+		} else if (atom_type == ATOM_HEAD) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
 				v->s = xstrdup("*");
 			else
 				v->s = xstrdup(" ");
 			continue;
-		} else if (starts_with(name, "align")) {
+		} else if (atom_type == ATOM_ALIGN) {
 			v->handler = align_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "end")) {
+		} else if (atom_type == ATOM_END) {
 			v->handler = end_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (starts_with(name, "if")) {
+		} else if (atom_type == ATOM_IF) {
 			const char *s;
 			if (skip_prefix(name, "if:", &s))
 				v->s = xstrdup(s);
@@ -1796,11 +1850,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup("");
 			v->handler = if_atom_handler;
 			continue;
-		} else if (!strcmp(name, "then")) {
+		} else if (atom_type == ATOM_THEN) {
 			v->handler = then_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "else")) {
+		} else if (atom_type == ATOM_ELSE) {
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type
  2021-05-08 15:22 ` [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
@ 2021-05-09  6:21   ` Christian Couder
  2021-05-09  8:26     ` Junio C Hamano
  2021-05-09 13:40     ` ZheNing Hu
  0 siblings, 2 replies; 27+ messages in thread
From: Christian Couder @ 2021-05-09  6:21 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Junio C Hamano, Jeff King, Christian Couder, Hariom Verma,
	ZheNing Hu

On Sat, May 8, 2021 at 5:22 PM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> In the original ref-filter design, it will copy the parsed
> atom's name and attributes to `used_atom[i].name` in the
> atom's parsing step, and use it again for string matching
> in the later specific ref attributes filling step. It use
> a lot of string matching to determine which atom we need.
>
> Introduce the enum member `valid_atom.atom_type` which
> record type of each valid_atom, in the first step of the
> atom parsing, `used_atom.atom_type` will record corresponding
> enum value from `valid_atom.atom_type`, and then in specific
> reference attribute filling step, only need to compare the
> value of the `used_atom.atom_type` to judge the atom type.
>
> At the same time, The value of an atom_type is the coordinate
> of its corresponding valid_atom entry, we can quickly index
> to the corresponding valid_atom entry by the atom_type value.

I am not sure it's worth having an atom_type field for each valid_atom
element if the value of that field is already the index of the
element, because then one would always be able to replace
`valid_atom[i].atom_type` with just `i`. Or is it for some kind of
type safety issue?

> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  ref-filter.c | 192 +++++++++++++++++++++++++++++++++------------------
>  1 file changed, 123 insertions(+), 69 deletions(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index f420bae6e5ba..4ce46e70dc99 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -108,6 +108,50 @@ static struct ref_to_worktree_map {
>         struct worktree **worktrees;
>  } ref_to_worktree_map;
>
> +enum atom_type {
> +ATOM_REFNAME,
...
+ATOM_ELSE,
+};

I wonder if the enum should be instead defined like this:

enum atom_type {
ATOM_UNKNOWN = 0,
ATOM_REFNAME,
...
ATOM_ELSE,
ATOM_INVALID, /* should be last */
};

As a struct containing an atom_type would typically be initialized
with 0 after being allocated, `ATOM_UNKNOWN = 0` could ensure that we
can easily distinguish such a struct where the atom_type is known from
such a struct where it is unknown yet.

Having ATOM_INVALID as always the last enum value (even if some new
ones are added later) could help us iterate over the valid atoms using
something like:

for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++)
        /* do something with valid_atom[i] */;

> +
>  /*
>   * An atom is a valid field atom listed below, possibly prefixed with
>   * a "*" to denote deref_tag().
> @@ -122,6 +166,7 @@ static struct used_atom {
>         const char *name;
>         cmp_type type;
>         info_source source;
> +       enum atom_type atom_type;
>         union {
>                 char color[COLOR_MAXLEN];
>                 struct align align;
> @@ -500,53 +545,54 @@ static int head_atom_parser(const struct ref_format *format, struct used_atom *a
>  }
>
>  static struct {
> +       enum atom_type atom_type;
>         const char *name;
>         info_source source;
>         cmp_type cmp_type;

I can see that the fields are already not in the same order as in
struct used_atom, but my opinion is that it would be better if they
would we as much as possible in the same order. Maybe one day we could
even unify these structs in some way.

Also as discussed above we might not even need to add an atom_type to
valid_atom[].

>         int (*parser)(const struct ref_format *format, struct used_atom *atom,
>                       const char *arg, struct strbuf *err);
>  } valid_atom[] = {

> @@ -628,6 +674,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
>         at = used_atom_cnt;
>         used_atom_cnt++;
>         REALLOC_ARRAY(used_atom, used_atom_cnt);
> +       used_atom[at].atom_type = valid_atom[i].atom_type;

As discussed above, if the value of an atom_type is the coordinate of
its corresponding valid_atom entry, then here the following would be
simpler:

       used_atom[at].atom_type = i;

>         used_atom[at].name = xmemdupz(atom, ep - atom);
>         used_atom[at].type = valid_atom[i].cmp_type;
>         used_atom[at].source = valid_atom[i].source;
> @@ -652,7 +699,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
>                 return -1;
>         if (*atom == '*')
>                 need_tagged = 1;
> -       if (!strcmp(valid_atom[i].name, "symref"))
> +       if (valid_atom[i].atom_type == ATOM_SYMREF)

In the same way as above, the above line might be replaced with the simpler:

       if (i == ATOM_SYMREF)

>                 need_symref = 1;
>         return at;
>  }

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type
  2021-05-09  6:21   ` Christian Couder
@ 2021-05-09  8:26     ` Junio C Hamano
  2021-05-09 13:44       ` ZheNing Hu
  2021-05-09 13:40     ` ZheNing Hu
  1 sibling, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-05-09  8:26 UTC (permalink / raw)
  To: Christian Couder
  Cc: ZheNing Hu via GitGitGadget, git, Jeff King, Christian Couder,
	Hariom Verma, ZheNing Hu

Christian Couder <christian.couder@gmail.com> writes:

> I am not sure it's worth having an atom_type field for each valid_atom
> element if the value of that field is already the index of the
> element, because then one would always be able to replace
> `valid_atom[i].atom_type` with just `i`. Or is it for some kind of
> type safety issue?

> I wonder if the enum should be instead defined like this:
>
> enum atom_type {
> ATOM_UNKNOWN = 0,
> ATOM_REFNAME,
> ...
> ATOM_ELSE,
> ATOM_INVALID, /* should be last */
> };
>
> As a struct containing an atom_type would typically be initialized
> with 0 after being allocated, `ATOM_UNKNOWN = 0` could ensure that we
> can easily distinguish such a struct where the atom_type is known from
> such a struct where it is unknown yet.
>
> Having ATOM_INVALID as always the last enum value (even if some new
> ones are added later) could help us iterate over the valid atoms using
> something like:
>
> for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++)
>         /* do something with valid_atom[i] */;

OK.

As to "safety", I think it still makes sense to declare "enum", but
I agree that we do not necessarily have to have it in the valid_atom[]
struct.  We could do something like this instead:

    static struct {
            const char *name;
            info_source source;
            cmp_type cmp_type;
            int (*parser)(const struct ref_format *format, struct used_atom *atom,
                          const char *arg, struct strbuf *err);
    } valid_atom[] = {
           [ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
           [ATOM_OBJECTTYPE] = { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype...
           [ATOM_OBJECTSIZE] = { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsi...
           ...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type
  2021-05-09  6:21   ` Christian Couder
  2021-05-09  8:26     ` Junio C Hamano
@ 2021-05-09 13:40     ` ZheNing Hu
  1 sibling, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-05-09 13:40 UTC (permalink / raw)
  To: Christian Couder
  Cc: ZheNing Hu via GitGitGadget, git, Junio C Hamano, Jeff King,
	Christian Couder, Hariom Verma

Christian Couder <christian.couder@gmail.com> 于2021年5月9日周日 下午2:21写道:
>
> > At the same time, The value of an atom_type is the coordinate
> > of its corresponding valid_atom entry, we can quickly index
> > to the corresponding valid_atom entry by the atom_type value.
>
> I am not sure it's worth having an atom_type field for each valid_atom
> element if the value of that field is already the index of the
> element, because then one would always be able to replace
> `valid_atom[i].atom_type` with just `i`. Or is it for some kind of
> type safety issue?
>

Well, I think the security issue here is just to allow used_atom and valid_atom
to be correctly mapped through atom_type. We don’t want the coder to forget to
update "enum atom_type" when adding new atoms to valid_atom in the future.
So maybe Junio's suggestion is reasonable, we delete the member atom_type in
valid_atom, but maintain the connection between atom_type and valid_atom items
by specifying atom_type as array coordinates.

> I wonder if the enum should be instead defined like this:
>
> enum atom_type {
> ATOM_UNKNOWN = 0,
> ATOM_REFNAME,
> ...
> ATOM_ELSE,
> ATOM_INVALID, /* should be last */
> };
>
> As a struct containing an atom_type would typically be initialized
> with 0 after being allocated, `ATOM_UNKNOWN = 0` could ensure that we
> can easily distinguish such a struct where the atom_type is known from
> such a struct where it is unknown yet.
>
> Having ATOM_INVALID as always the last enum value (even if some new
> ones are added later) could help us iterate over the valid atoms using
> something like:
>
> for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++)
>         /* do something with valid_atom[i] */;
>

Thanks, this suggestion is good!

> > +
> >  /*
> >   * An atom is a valid field atom listed below, possibly prefixed with
> >   * a "*" to denote deref_tag().
> > @@ -122,6 +166,7 @@ static struct used_atom {
> >         const char *name;
> >         cmp_type type;
> >         info_source source;
> > +       enum atom_type atom_type;
> >         union {
> >                 char color[COLOR_MAXLEN];
> >                 struct align align;
> > @@ -500,53 +545,54 @@ static int head_atom_parser(const struct ref_format *format, struct used_atom *a
> >  }
> >
> >  static struct {
> > +       enum atom_type atom_type;
> >         const char *name;
> >         info_source source;
> >         cmp_type cmp_type;
>
> I can see that the fields are already not in the same order as in
> struct used_atom, but my opinion is that it would be better if they
> would we as much as possible in the same order. Maybe one day we could
> even unify these structs in some way.
>

Yes, atom_value, valid_atom, used_atom, It may be difficult to read for the
first time. Maybe unifying them is a good direction for the future.

> Also as discussed above we might not even need to add an atom_type to
> valid_atom[].
>

OK.

> >         int (*parser)(const struct ref_format *format, struct used_atom *atom,
> >                       const char *arg, struct strbuf *err);
> >  } valid_atom[] = {
>
> > @@ -628,6 +674,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
> >         at = used_atom_cnt;
> >         used_atom_cnt++;
> >         REALLOC_ARRAY(used_atom, used_atom_cnt);
> > +       used_atom[at].atom_type = valid_atom[i].atom_type;
>
> As discussed above, if the value of an atom_type is the coordinate of
> its corresponding valid_atom entry, then here the following would be
> simpler:
>
>        used_atom[at].atom_type = i;
>

I agree.

> >         used_atom[at].name = xmemdupz(atom, ep - atom);
> >         used_atom[at].type = valid_atom[i].cmp_type;
> >         used_atom[at].source = valid_atom[i].source;
> > @@ -652,7 +699,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
> >                 return -1;
> >         if (*atom == '*')
> >                 need_tagged = 1;
> > -       if (!strcmp(valid_atom[i].name, "symref"))
> > +       if (valid_atom[i].atom_type == ATOM_SYMREF)
>
> In the same way as above, the above line might be replaced with the simpler:
>
>        if (i == ATOM_SYMREF)
>
> >                 need_symref = 1;
> >         return at;
> >  }

Thanks!
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type
  2021-05-09  8:26     ` Junio C Hamano
@ 2021-05-09 13:44       ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-05-09 13:44 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, ZheNing Hu via GitGitGadget, git, Jeff King,
	Christian Couder, Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年5月9日周日 下午4:26写道:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > I am not sure it's worth having an atom_type field for each valid_atom
> > element if the value of that field is already the index of the
> > element, because then one would always be able to replace
> > `valid_atom[i].atom_type` with just `i`. Or is it for some kind of
> > type safety issue?
>
> > I wonder if the enum should be instead defined like this:
> >
> > enum atom_type {
> > ATOM_UNKNOWN = 0,
> > ATOM_REFNAME,
> > ...
> > ATOM_ELSE,
> > ATOM_INVALID, /* should be last */
> > };
> >
> > As a struct containing an atom_type would typically be initialized
> > with 0 after being allocated, `ATOM_UNKNOWN = 0` could ensure that we
> > can easily distinguish such a struct where the atom_type is known from
> > such a struct where it is unknown yet.
> >
> > Having ATOM_INVALID as always the last enum value (even if some new
> > ones are added later) could help us iterate over the valid atoms using
> > something like:
> >
> > for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++)
> >         /* do something with valid_atom[i] */;
>
> OK.
>
> As to "safety", I think it still makes sense to declare "enum", but
> I agree that we do not necessarily have to have it in the valid_atom[]
> struct.  We could do something like this instead:
>
>     static struct {
>             const char *name;
>             info_source source;
>             cmp_type cmp_type;
>             int (*parser)(const struct ref_format *format, struct used_atom *atom,
>                           const char *arg, struct strbuf *err);
>     } valid_atom[] = {
>            [ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
>            [ATOM_OBJECTTYPE] = { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype...
>            [ATOM_OBJECTSIZE] = { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsi...
>            ...

Thank! Good suggection. We hope that the atom_type and
valid_atom items establish a clear connection. Maybe we
should add some comments before the definition of the
`enum atom_type` to remind the coder of the connection
between atom_type and valid_atom.

--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 0/2] [GSOC][RFC] ref-filter: introduce enum atom_type
  2021-05-08 15:22 [PATCH 0/2] [GSOC][RFC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
  2021-05-08 15:22 ` [PATCH 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
  2021-05-08 15:22 ` [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
@ 2021-05-10 15:03 ` ZheNing Hu via GitGitGadget
  2021-05-10 15:03   ` [PATCH v2 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
                     ` (2 more replies)
  2 siblings, 3 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-10 15:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder, ZheNing Hu

Change from last version:

 * Remove the atom_type member from valid_atom at the suggestion of
   Christian and Junio, and use ATOM_* as the coordinates of valid_atom.
 * Add ATOM_INVALID and ATOM_UNKNOWN under Christian's suggestion, we can
   traverse valid_atom through them.
 * Add a note about enum atom_type to remind readers of its role.

ZheNing Hu (2):
  [GSOC] ref-filter: add objectsize to used_atom
  [GSOC] ref-filter: introduce enum atom_type

 ref-filter.c | 231 ++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 155 insertions(+), 76 deletions(-)


base-commit: 7e391989789db82983665667013a46eabc6fc570
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-951%2Fadlternative%2Fref-filter-atom-type-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-951/adlternative/ref-filter-atom-type-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/951

Range-diff vs v1:

 1:  91ca57c9d04a = 1:  91ca57c9d04a [GSOC] ref-filter: add objectsize to used_atom
 2:  3770df182983 ! 2:  a1f70b39b7ef [GSOC][RFC] ref-filter: introduce enum atom_type
     @@ Metadata
      Author: ZheNing Hu <adlternative@gmail.com>
      
       ## Commit message ##
     -    [GSOC][RFC] ref-filter: introduce enum atom_type
     +    [GSOC] ref-filter: introduce enum atom_type
      
          In the original ref-filter design, it will copy the parsed
          atom's name and attributes to `used_atom[i].name` in the
     @@ Commit message
          in the later specific ref attributes filling step. It use
          a lot of string matching to determine which atom we need.
      
     -    Introduce the enum member `valid_atom.atom_type` which
     -    record type of each valid_atom, in the first step of the
     -    atom parsing, `used_atom.atom_type` will record corresponding
     -    enum value from `valid_atom.atom_type`, and then in specific
     -    reference attribute filling step, only need to compare the
     -    value of the `used_atom.atom_type` to judge the atom type.
     +    Introduce the enum "atom_type", each enum value is named
     +    as `ATOM_*`, which is the index of each corresponding
     +    valid_atom entry. In the first step of the atom parsing,
     +    `used_atom.atom_type` will record corresponding enum value
     +    from valid_atom entry index, and then in specific reference
     +    attribute filling step, only need to compare the value of
     +    the `used_atom.atom_type` to judge the atom type.
      
     -    At the same time, The value of an atom_type is the coordinate
     -    of its corresponding valid_atom entry, we can quickly index
     -    to the corresponding valid_atom entry by the atom_type value.
     +    the enum value of `ATOM_UNKNOWN` is equals to zero, which
     +    could ensure that we can easily distinguish such a struct
     +    where the atom_type is known from such a struct where it
     +    is unknown yet.
     +
     +    the enum value of `ATOM_INVALID` is equals to the size of
     +    valid_atom array, which could help us iterate over
     +    valid_atom array using something like:
     +
     +    for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++)
     +            /* do something with valid_atom[i] */;
      
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
     @@ ref-filter.c: static struct ref_to_worktree_map {
       	struct worktree **worktrees;
       } ref_to_worktree_map;
       
     ++/*
     ++ * The enum atom_type is used as the coordinates of valid_atom entry.
     ++ * In the atom parsing stage, it will be passed to used_atom.atom_type
     ++ * as the identifier of the atom type. We can judge the type of used_atom
     ++ * entry by `if (used_atom[i].atom_type == ATOM_*)`.
     ++ *
     ++ * ATOM_UNKNOWN equals to 0, used as an enumeration value of uninitialized
     ++ * atom_type.
     ++ * ATOM_INVALID equals to the size of valid_atom array, which could help us
     ++ * iterate over valid_atom array like this:
     ++ *
     ++ * 	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
     ++ *		int len = strlen(valid_atom[i].name);
     ++ *		if (len == atom_len && !memcmp(valid_atom[i].name, sp, len))
     ++ *			break;
     ++ *	}
     ++ */
      +enum atom_type {
     ++ATOM_UNKNOWN,
      +ATOM_REFNAME,
      +ATOM_OBJECTTYPE,
      +ATOM_OBJECTSIZE,
     @@ ref-filter.c: static struct ref_to_worktree_map {
      +ATOM_IF,
      +ATOM_THEN,
      +ATOM_ELSE,
     ++ATOM_INVALID,
      +};
      +
       /*
        * An atom is a valid field atom listed below, possibly prefixed with
        * a "*" to denote deref_tag().
     -@@ ref-filter.c: static struct used_atom {
     - 	const char *name;
     - 	cmp_type type;
     - 	info_source source;
     -+	enum atom_type atom_type;
     - 	union {
     - 		char color[COLOR_MAXLEN];
     - 		struct align align;
     -@@ ref-filter.c: static int head_atom_parser(const struct ref_format *format, struct used_atom *a
     - }
     - 
     - static struct {
     +@@ ref-filter.c: static struct ref_to_worktree_map {
     +  * array.
     +  */
     + static struct used_atom {
      +	enum atom_type atom_type;
       	const char *name;
     + 	cmp_type type;
       	info_source source;
     - 	cmp_type cmp_type;
     +@@ ref-filter.c: static struct {
       	int (*parser)(const struct ref_format *format, struct used_atom *atom,
       		      const char *arg, struct strbuf *err);
       } valid_atom[] = {
     @@ ref-filter.c: static int head_atom_parser(const struct ref_format *format, struc
      -	{ "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
      -	{ "then", SOURCE_NONE },
      -	{ "else", SOURCE_NONE },
     -+	{ ATOM_REFNAME, "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
     -+	{ ATOM_OBJECTTYPE, "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
     -+	{ ATOM_OBJECTSIZE, "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
     -+	{ ATOM_OBJECTNAME, "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
     -+	{ ATOM_DELTABASE, "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
     -+	{ ATOM_TREE, "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
     -+	{ ATOM_PARENT, "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
     -+	{ ATOM_NUMPARENT, "numparent", SOURCE_OBJ, FIELD_ULONG },
     -+	{ ATOM_OBJECT, "object", SOURCE_OBJ },
     -+	{ ATOM_TYPE, "type", SOURCE_OBJ },
     -+	{ ATOM_TAG, "tag", SOURCE_OBJ },
     -+	{ ATOM_AUTHOR, "author", SOURCE_OBJ },
     -+	{ ATOM_AUTHORNAME, "authorname", SOURCE_OBJ },
     -+	{ ATOM_AUTHOREMAIL, "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     -+	{ ATOM_AUTHORDATE, "authordate", SOURCE_OBJ, FIELD_TIME },
     -+	{ ATOM_COMMITTER, "committer", SOURCE_OBJ },
     -+	{ ATOM_COMMITTERNAME, "committername", SOURCE_OBJ },
     -+	{ ATOM_COMMITTEREMAIL, "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     -+	{ ATOM_COMMITTERDATE, "committerdate", SOURCE_OBJ, FIELD_TIME },
     -+	{ ATOM_TAGGER, "tagger", SOURCE_OBJ },
     -+	{ ATOM_TAGGERNAME, "taggername", SOURCE_OBJ },
     -+	{ ATOM_TAGGEREMAIL, "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     -+	{ ATOM_TAGGERDATE, "taggerdate", SOURCE_OBJ, FIELD_TIME },
     -+	{ ATOM_CREATOR, "creator", SOURCE_OBJ },
     -+	{ ATOM_CREATORDATE, "creatordate", SOURCE_OBJ, FIELD_TIME },
     -+	{ ATOM_SUBJECT, "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
     -+	{ ATOM_BODY, "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
     -+	{ ATOM_TRAILERS, "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
     -+	{ ATOM_CONTENTS, "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
     -+	{ ATOM_UPSTREAM, "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
     -+	{ ATOM_PUSH, "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
     -+	{ ATOM_SYMREF, "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
     -+	{ ATOM_FLAG, "flag", SOURCE_NONE },
     -+	{ ATOM_HEAD, "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
     -+	{ ATOM_COLOR, "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
     -+	{ ATOM_WORKTREEPATH, "worktreepath", SOURCE_NONE },
     -+	{ ATOM_ALIGN, "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
     -+	{ ATOM_END, "end", SOURCE_NONE },
     -+	{ ATOM_IF, "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
     -+	{ ATOM_THEN, "then", SOURCE_NONE },
     -+	{ ATOM_ELSE, "else", SOURCE_NONE },
     ++	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
     ++	[ATOM_OBJECTTYPE] = { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
     ++	[ATOM_OBJECTSIZE] = { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
     ++	[ATOM_OBJECTNAME] = { "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
     ++	[ATOM_DELTABASE] = { "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
     ++	[ATOM_TREE] = { "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
     ++	[ATOM_PARENT] = { "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
     ++	[ATOM_NUMPARENT] = { "numparent", SOURCE_OBJ, FIELD_ULONG },
     ++	[ATOM_OBJECT] = { "object", SOURCE_OBJ },
     ++	[ATOM_TYPE] = { "type", SOURCE_OBJ },
     ++	[ATOM_TAG] = { "tag", SOURCE_OBJ },
     ++	[ATOM_AUTHOR] = { "author", SOURCE_OBJ },
     ++	[ATOM_AUTHORNAME] = { "authorname", SOURCE_OBJ },
     ++	[ATOM_AUTHOREMAIL] = { "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     ++	[ATOM_AUTHORDATE] = { "authordate", SOURCE_OBJ, FIELD_TIME },
     ++	[ATOM_COMMITTER] = { "committer", SOURCE_OBJ },
     ++	[ATOM_COMMITTERNAME] = { "committername", SOURCE_OBJ },
     ++	[ATOM_COMMITTEREMAIL] = { "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     ++	[ATOM_COMMITTERDATE] = { "committerdate", SOURCE_OBJ, FIELD_TIME },
     ++	[ATOM_TAGGER] = { "tagger", SOURCE_OBJ },
     ++	[ATOM_TAGGERNAME] = { "taggername", SOURCE_OBJ },
     ++	[ATOM_TAGGEREMAIL] = { "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
     ++	[ATOM_TAGGERDATE] = { "taggerdate", SOURCE_OBJ, FIELD_TIME },
     ++	[ATOM_CREATOR] = { "creator", SOURCE_OBJ },
     ++	[ATOM_CREATORDATE] = { "creatordate", SOURCE_OBJ, FIELD_TIME },
     ++	[ATOM_SUBJECT] = { "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
     ++	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
     ++	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
     ++	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
     ++	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
     ++	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
     ++	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
     ++	[ATOM_FLAG] = { "flag", SOURCE_NONE },
     ++	[ATOM_HEAD] = { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
     ++	[ATOM_COLOR] = { "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
     ++	[ATOM_WORKTREEPATH] = { "worktreepath", SOURCE_NONE },
     ++	[ATOM_ALIGN] = { "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
     ++	[ATOM_END] = { "end", SOURCE_NONE },
     ++	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
     ++	[ATOM_THEN] = { "then", SOURCE_NONE },
     ++	[ATOM_ELSE] = { "else", SOURCE_NONE },
       	/*
       	 * Please update $__git_ref_fieldlist in git-completion.bash
       	 * when you add new atoms
     +@@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
     + 	atom_len = (arg ? arg : ep) - sp;
     + 
     + 	/* Is the atom a valid one? */
     +-	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
     ++	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
     + 		int len = strlen(valid_atom[i].name);
     + 		if (len == atom_len && !memcmp(valid_atom[i].name, sp, len))
     + 			break;
     + 	}
     + 
     +-	if (ARRAY_SIZE(valid_atom) <= i)
     ++	if (i == ATOM_INVALID)
     + 		return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
     + 				       (int)(ep-atom), atom);
     + 	if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
      @@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
       	at = used_atom_cnt;
       	used_atom_cnt++;
       	REALLOC_ARRAY(used_atom, used_atom_cnt);
     -+	used_atom[at].atom_type = valid_atom[i].atom_type;
     ++	used_atom[at].atom_type = i;
       	used_atom[at].name = xmemdupz(atom, ep - atom);
       	used_atom[at].type = valid_atom[i].cmp_type;
       	used_atom[at].source = valid_atom[i].source;
     @@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
       	if (*atom == '*')
       		need_tagged = 1;
      -	if (!strcmp(valid_atom[i].name, "symref"))
     -+	if (valid_atom[i].atom_type == ATOM_SYMREF)
     ++	if (i == ATOM_SYMREF)
       		need_symref = 1;
       	return at;
       }

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/2] [GSOC] ref-filter: add objectsize to used_atom
  2021-05-10 15:03 ` [PATCH v2 0/2] " ZheNing Hu via GitGitGadget
@ 2021-05-10 15:03   ` ZheNing Hu via GitGitGadget
  2021-05-10 15:03   ` [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
  2021-05-12 12:11   ` [PATCH v3 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
  2 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-10 15:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder,
	ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Since "objectsize:size" is composed of two parts,
"type:attribute". However, the original implementation
did not decouple the two parts "type" and "attribute" well,
we still need to judge separately whether the atom is
"objectsize" or "objectsize:disk" in `grab_common_values()`.

Add a new member `objectsize` to the union `used_atom.u`,
so that we can separate the judgment of atom type from the
judgment of atom attribute, This will bring scalability to
atom `%(objectsize)`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index a0adb4551d87..f420bae6e5ba 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -146,6 +146,9 @@ static struct used_atom {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
 		} oid;
+		struct {
+			enum { O_SIZE, O_SIZE_DISK } option;
+		} objectsize;
 		struct email_option {
 			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
 		} email_option;
@@ -269,11 +272,13 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
+		atom->u.objectsize.option = O_SIZE;
 		if (*atom->name == '*')
 			oi_deref.info.sizep = &oi_deref.size;
 		else
 			oi.info.sizep = &oi.size;
 	} else if (!strcmp(arg, "disk")) {
+		atom->u.objectsize.option = O_SIZE_DISK;
 		if (*atom->name == '*')
 			oi_deref.info.disk_sizep = &oi_deref.disk_size;
 		else
@@ -967,12 +972,14 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 			name++;
 		if (!strcmp(name, "objecttype"))
 			v->s = xstrdup(type_name(oi->type));
-		else if (!strcmp(name, "objectsize:disk")) {
-			v->value = oi->disk_size;
-			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
-		} else if (!strcmp(name, "objectsize")) {
-			v->value = oi->size;
-			v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
+		else if (starts_with(name, "objectsize")) {
+			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
+				v->value = oi->disk_size;
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
+			} else if (used_atom[i].u.objectsize.option == O_SIZE) {
+				v->value = oi->size;
+				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
+			}
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-10 15:03 ` [PATCH v2 0/2] " ZheNing Hu via GitGitGadget
  2021-05-10 15:03   ` [PATCH v2 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
@ 2021-05-10 15:03   ` ZheNing Hu via GitGitGadget
  2021-05-11  2:14     ` Junio C Hamano
  2021-05-12 12:11   ` [PATCH v3 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-10 15:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder,
	ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

In the original ref-filter design, it will copy the parsed
atom's name and attributes to `used_atom[i].name` in the
atom's parsing step, and use it again for string matching
in the later specific ref attributes filling step. It use
a lot of string matching to determine which atom we need.

Introduce the enum "atom_type", each enum value is named
as `ATOM_*`, which is the index of each corresponding
valid_atom entry. In the first step of the atom parsing,
`used_atom.atom_type` will record corresponding enum value
from valid_atom entry index, and then in specific reference
attribute filling step, only need to compare the value of
the `used_atom.atom_type` to judge the atom type.

the enum value of `ATOM_UNKNOWN` is equals to zero, which
could ensure that we can easily distinguish such a struct
where the atom_type is known from such a struct where it
is unknown yet.

the enum value of `ATOM_INVALID` is equals to the size of
valid_atom array, which could help us iterate over
valid_atom array using something like:

for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++)
        /* do something with valid_atom[i] */;

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 214 ++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 143 insertions(+), 71 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f420bae6e5ba..e3bf2cd1aaec 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -108,6 +108,69 @@ static struct ref_to_worktree_map {
 	struct worktree **worktrees;
 } ref_to_worktree_map;
 
+/*
+ * The enum atom_type is used as the coordinates of valid_atom entry.
+ * In the atom parsing stage, it will be passed to used_atom.atom_type
+ * as the identifier of the atom type. We can judge the type of used_atom
+ * entry by `if (used_atom[i].atom_type == ATOM_*)`.
+ *
+ * ATOM_UNKNOWN equals to 0, used as an enumeration value of uninitialized
+ * atom_type.
+ * ATOM_INVALID equals to the size of valid_atom array, which could help us
+ * iterate over valid_atom array like this:
+ *
+ * 	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
+ *		int len = strlen(valid_atom[i].name);
+ *		if (len == atom_len && !memcmp(valid_atom[i].name, sp, len))
+ *			break;
+ *	}
+ */
+enum atom_type {
+ATOM_UNKNOWN,
+ATOM_REFNAME,
+ATOM_OBJECTTYPE,
+ATOM_OBJECTSIZE,
+ATOM_OBJECTNAME,
+ATOM_DELTABASE,
+ATOM_TREE,
+ATOM_PARENT,
+ATOM_NUMPARENT,
+ATOM_OBJECT,
+ATOM_TYPE,
+ATOM_TAG,
+ATOM_AUTHOR,
+ATOM_AUTHORNAME,
+ATOM_AUTHOREMAIL,
+ATOM_AUTHORDATE,
+ATOM_COMMITTER,
+ATOM_COMMITTERNAME,
+ATOM_COMMITTEREMAIL,
+ATOM_COMMITTERDATE,
+ATOM_TAGGER,
+ATOM_TAGGERNAME,
+ATOM_TAGGEREMAIL,
+ATOM_TAGGERDATE,
+ATOM_CREATOR,
+ATOM_CREATORDATE,
+ATOM_SUBJECT,
+ATOM_BODY,
+ATOM_TRAILERS,
+ATOM_CONTENTS,
+ATOM_UPSTREAM,
+ATOM_PUSH,
+ATOM_SYMREF,
+ATOM_FLAG,
+ATOM_HEAD,
+ATOM_COLOR,
+ATOM_WORKTREEPATH,
+ATOM_ALIGN,
+ATOM_END,
+ATOM_IF,
+ATOM_THEN,
+ATOM_ELSE,
+ATOM_INVALID,
+};
+
 /*
  * An atom is a valid field atom listed below, possibly prefixed with
  * a "*" to denote deref_tag().
@@ -119,6 +182,7 @@ static struct ref_to_worktree_map {
  * array.
  */
 static struct used_atom {
+	enum atom_type atom_type;
 	const char *name;
 	cmp_type type;
 	info_source source;
@@ -506,47 +570,47 @@ static struct {
 	int (*parser)(const struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
-	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
-	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
-	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
-	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
-	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
-	{ "object", SOURCE_OBJ },
-	{ "type", SOURCE_OBJ },
-	{ "tag", SOURCE_OBJ },
-	{ "author", SOURCE_OBJ },
-	{ "authorname", SOURCE_OBJ },
-	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "authordate", SOURCE_OBJ, FIELD_TIME },
-	{ "committer", SOURCE_OBJ },
-	{ "committername", SOURCE_OBJ },
-	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "tagger", SOURCE_OBJ },
-	{ "taggername", SOURCE_OBJ },
-	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "creator", SOURCE_OBJ },
-	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
-	{ "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
-	{ "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
-	{ "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
-	{ "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
-	{ "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "flag", SOURCE_NONE },
-	{ "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
-	{ "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
-	{ "worktreepath", SOURCE_NONE },
-	{ "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
-	{ "end", SOURCE_NONE },
-	{ "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
-	{ "then", SOURCE_NONE },
-	{ "else", SOURCE_NONE },
+	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	[ATOM_OBJECTTYPE] = { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
+	[ATOM_OBJECTSIZE] = { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
+	[ATOM_OBJECTNAME] = { "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
+	[ATOM_DELTABASE] = { "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
+	[ATOM_TREE] = { "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	[ATOM_PARENT] = { "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	[ATOM_NUMPARENT] = { "numparent", SOURCE_OBJ, FIELD_ULONG },
+	[ATOM_OBJECT] = { "object", SOURCE_OBJ },
+	[ATOM_TYPE] = { "type", SOURCE_OBJ },
+	[ATOM_TAG] = { "tag", SOURCE_OBJ },
+	[ATOM_AUTHOR] = { "author", SOURCE_OBJ },
+	[ATOM_AUTHORNAME] = { "authorname", SOURCE_OBJ },
+	[ATOM_AUTHOREMAIL] = { "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_AUTHORDATE] = { "authordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_COMMITTER] = { "committer", SOURCE_OBJ },
+	[ATOM_COMMITTERNAME] = { "committername", SOURCE_OBJ },
+	[ATOM_COMMITTEREMAIL] = { "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_COMMITTERDATE] = { "committerdate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_TAGGER] = { "tagger", SOURCE_OBJ },
+	[ATOM_TAGGERNAME] = { "taggername", SOURCE_OBJ },
+	[ATOM_TAGGEREMAIL] = { "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_TAGGERDATE] = { "taggerdate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_CREATOR] = { "creator", SOURCE_OBJ },
+	[ATOM_CREATORDATE] = { "creatordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_SUBJECT] = { "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
+	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
+	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
+	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	[ATOM_FLAG] = { "flag", SOURCE_NONE },
+	[ATOM_HEAD] = { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
+	[ATOM_COLOR] = { "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
+	[ATOM_WORKTREEPATH] = { "worktreepath", SOURCE_NONE },
+	[ATOM_ALIGN] = { "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
+	[ATOM_END] = { "end", SOURCE_NONE },
+	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
+	[ATOM_THEN] = { "then", SOURCE_NONE },
+	[ATOM_ELSE] = { "else", SOURCE_NONE },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -610,13 +674,13 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	atom_len = (arg ? arg : ep) - sp;
 
 	/* Is the atom a valid one? */
-	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
+	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
 		int len = strlen(valid_atom[i].name);
 		if (len == atom_len && !memcmp(valid_atom[i].name, sp, len))
 			break;
 	}
 
-	if (ARRAY_SIZE(valid_atom) <= i)
+	if (i == ATOM_INVALID)
 		return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
 				       (int)(ep-atom), atom);
 	if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
@@ -628,6 +692,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	at = used_atom_cnt;
 	used_atom_cnt++;
 	REALLOC_ARRAY(used_atom, used_atom_cnt);
+	used_atom[at].atom_type = i;
 	used_atom[at].name = xmemdupz(atom, ep - atom);
 	used_atom[at].type = valid_atom[i].cmp_type;
 	used_atom[at].source = valid_atom[i].source;
@@ -652,7 +717,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return -1;
 	if (*atom == '*')
 		need_tagged = 1;
-	if (!strcmp(valid_atom[i].name, "symref"))
+	if (i == ATOM_SYMREF)
 		need_symref = 1;
 	return at;
 }
@@ -965,14 +1030,15 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "objecttype"))
+		if (atom_type == ATOM_OBJECTTYPE)
 			v->s = xstrdup(type_name(oi->type));
-		else if (starts_with(name, "objectsize")) {
+		else if (atom_type == ATOM_OBJECTSIZE) {
 			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
 				v->value = oi->disk_size;
 				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
@@ -980,9 +1046,9 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 				v->value = oi->size;
 				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
 			}
-		} else if (!strcmp(name, "deltabase"))
+		} else if (atom_type == ATOM_DELTABASE)
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
-		else if (deref)
+		else if (atom_type == ATOM_OBJECTNAME && deref)
 			grab_oid(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
@@ -995,16 +1061,17 @@ static void grab_tag_values(struct atom_value *val, int deref, struct object *ob
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tag"))
+		if (atom_type == ATOM_TAG)
 			v->s = xstrdup(tag->tag);
-		else if (!strcmp(name, "type") && tag->tagged)
+		else if (atom_type == ATOM_TYPE && tag->tagged)
 			v->s = xstrdup(type_name(tag->tagged->type));
-		else if (!strcmp(name, "object") && tag->tagged)
+		else if (atom_type == ATOM_OBJECT && tag->tagged)
 			v->s = xstrdup(oid_to_hex(&tag->tagged->oid));
 	}
 }
@@ -1017,18 +1084,20 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
+		if (atom_type == ATOM_TREE &&
+		    grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
 			continue;
-		if (!strcmp(name, "numparent")) {
+		if (atom_type == ATOM_NUMPARENT) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (starts_with(name, "parent")) {
+		else if (atom_type == ATOM_PARENT) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
@@ -1208,15 +1277,16 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 		return;
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
-		if (starts_with(name, "creatordate"))
+		if (atom_type == ATOM_CREATORDATE)
 			grab_date(wholine, v, name);
-		else if (!strcmp(name, "creator"))
+		else if (atom_type == ATOM_CREATOR)
 			v->s = copy_line(wholine);
 	}
 }
@@ -1696,6 +1766,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	/* Fill in specials first */
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
+		enum atom_type atom_type = atom->atom_type;
 		const char *name = used_atom[i].name;
 		struct atom_value *v = &ref->value[i];
 		int deref = 0;
@@ -1710,18 +1781,18 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			name++;
 		}
 
-		if (starts_with(name, "refname"))
+		if (atom_type == ATOM_REFNAME)
 			refname = get_refname(atom, ref);
-		else if (!strcmp(name, "worktreepath")) {
+		else if (atom_type == ATOM_WORKTREEPATH) {
 			if (ref->kind == FILTER_REFS_BRANCHES)
 				v->s = get_worktree_path(atom, ref);
 			else
 				v->s = xstrdup("");
 			continue;
 		}
-		else if (starts_with(name, "symref"))
+		else if (atom_type == ATOM_SYMREF)
 			refname = get_symref(atom, ref);
-		else if (starts_with(name, "upstream")) {
+		else if (atom_type == ATOM_UPSTREAM) {
 			const char *branch_name;
 			/* only local branches may have an upstream */
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1737,7 +1808,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			else
 				v->s = xstrdup("");
 			continue;
-		} else if (atom->u.remote_ref.push) {
+		} else if (atom_type == ATOM_PUSH && atom->u.remote_ref.push) {
 			const char *branch_name;
 			v->s = xstrdup("");
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1756,10 +1827,10 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			free((char *)v->s);
 			fill_remote_ref_details(atom, refname, branch, &v->s);
 			continue;
-		} else if (starts_with(name, "color:")) {
+		} else if (atom_type == ATOM_COLOR) {
 			v->s = xstrdup(atom->u.color);
 			continue;
-		} else if (!strcmp(name, "flag")) {
+		} else if (atom_type == ATOM_FLAG) {
 			char buf[256], *cp = buf;
 			if (ref->flag & REF_ISSYMREF)
 				cp = copy_advance(cp, ",symref");
@@ -1772,23 +1843,24 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_oid(name, "objectname", &ref->objectname, v, atom)) {
-			continue;
-		} else if (!strcmp(name, "HEAD")) {
+		} else if (!deref && atom_type == ATOM_OBJECTNAME &&
+			   grab_oid(name, "objectname", &ref->objectname, v, atom)) {
+				continue;
+		} else if (atom_type == ATOM_HEAD) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
 				v->s = xstrdup("*");
 			else
 				v->s = xstrdup(" ");
 			continue;
-		} else if (starts_with(name, "align")) {
+		} else if (atom_type == ATOM_ALIGN) {
 			v->handler = align_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "end")) {
+		} else if (atom_type == ATOM_END) {
 			v->handler = end_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (starts_with(name, "if")) {
+		} else if (atom_type == ATOM_IF) {
 			const char *s;
 			if (skip_prefix(name, "if:", &s))
 				v->s = xstrdup(s);
@@ -1796,11 +1868,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup("");
 			v->handler = if_atom_handler;
 			continue;
-		} else if (!strcmp(name, "then")) {
+		} else if (atom_type == ATOM_THEN) {
 			v->handler = then_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "else")) {
+		} else if (atom_type == ATOM_ELSE) {
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-10 15:03   ` [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
@ 2021-05-11  2:14     ` Junio C Hamano
  2021-05-11  5:51       ` Christian Couder
  2021-05-11 12:18       ` ZheNing Hu
  0 siblings, 2 replies; 27+ messages in thread
From: Junio C Hamano @ 2021-05-11  2:14 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Jeff King, Hariom Verma, Christian Couder, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> the enum value of `ATOM_UNKNOWN` is equals to zero, which

s/the/The/;

> could ensure that we can easily distinguish such a struct
> where the atom_type is known from such a struct where it
> is unknown yet.
>
> the enum value of `ATOM_INVALID` is equals to the size of

Ditto.

> +/*
> + * The enum atom_type is used as the coordinates of valid_atom entry.
> + * In the atom parsing stage, it will be passed to used_atom.atom_type
> + * as the identifier of the atom type. We can judge the type of used_atom
> + * entry by `if (used_atom[i].atom_type == ATOM_*)`.
> + *
> + * ATOM_UNKNOWN equals to 0, used as an enumeration value of uninitialized
> + * atom_type.

Shouldn't it be (-1)?

And I'd assume I am right in the following.

> + * ATOM_INVALID equals to the size of valid_atom array, which could help us
> + * iterate over valid_atom array like this:
> + *
> + * 	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {

I find it far more intuitive to say

	for (i = 0; i < ATOM_INVALID; i++)

than having to say UNKNOWN+1.

In any case, the values should be indented, and a comment should
ensure that the final one stays at the end, perhaps like this.

	enum atom_type {
		ATOM_INVALID = -2,
		ATOM_UNKNOWN = -1,
		ATOM_REFNAME,
		...
		ATOM_ELSE,
		ATOM_MAX /* MUST BE AT THE END */
	}

(note that the trailing comma is deliberately omitted).

It would allow people to say

	for (i = 0; i < ATOM_MAX; i++)

instead, which would be even nicer.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-11  2:14     ` Junio C Hamano
@ 2021-05-11  5:51       ` Christian Couder
  2021-05-11  6:12         ` Junio C Hamano
                           ` (2 more replies)
  2021-05-11 12:18       ` ZheNing Hu
  1 sibling, 3 replies; 27+ messages in thread
From: Christian Couder @ 2021-05-11  5:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, git, Jeff King, Hariom Verma, ZheNing Hu

On Tue, May 11, 2021 at 4:14 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> > +/*
> > + * The enum atom_type is used as the coordinates of valid_atom entry.
> > + * In the atom parsing stage, it will be passed to used_atom.atom_type
> > + * as the identifier of the atom type. We can judge the type of used_atom
> > + * entry by `if (used_atom[i].atom_type == ATOM_*)`.
> > + *
> > + * ATOM_UNKNOWN equals to 0, used as an enumeration value of uninitialized
> > + * atom_type.
>
> Shouldn't it be (-1)?

If it's -1 instead of 0, then it might be a bit more complex to
initialize structs that contain such a field, as it cannot be done
with only xcalloc().

> And I'd assume I am right in the following.
>
> > + * ATOM_INVALID equals to the size of valid_atom array, which could help us
> > + * iterate over valid_atom array like this:
> > + *
> > + *   for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
>
> I find it far more intuitive to say
>
>         for (i = 0; i < ATOM_INVALID; i++)
>
> than having to say UNKNOWN+1.

Yeah, that's more intuitive. But in my opinion, using `ATOM_UNKNOWN +
1` instead of `0` at least shouldn't often result in more lines of
code, and should be a bit easier to get right, compared to having to
initialize the field with ATOM_UNKNOWN.

> In any case, the values should be indented, and a comment should
> ensure that the final one stays at the end, perhaps like this.
>
>         enum atom_type {
>                 ATOM_INVALID = -2,
>                 ATOM_UNKNOWN = -1,
>                 ATOM_REFNAME,
>                 ...
>                 ATOM_ELSE,
>                 ATOM_MAX /* MUST BE AT THE END */

I agree that a comment telling people that it must be at the end is good.

>         }
>
> (note that the trailing comma is deliberately omitted).

Yeah.

> It would allow people to say
>
>         for (i = 0; i < ATOM_MAX; i++)
>
> instead, which would be even nicer.

Yeah, it's also a tradeoff to have the last one called ATOM_MAX
instead of ATOM_INVALID, and to have a separate ATOM_INVALID if it's
needed.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-11  5:51       ` Christian Couder
@ 2021-05-11  6:12         ` Junio C Hamano
  2021-05-11 12:53           ` ZheNing Hu
  2021-05-11 12:37         ` ZheNing Hu
  2021-05-11 13:05         ` Junio C Hamano
  2 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-05-11  6:12 UTC (permalink / raw)
  To: Christian Couder
  Cc: ZheNing Hu via GitGitGadget, git, Jeff King, Hariom Verma, ZheNing Hu

Christian Couder <christian.couder@gmail.com> writes:

>> I find it far more intuitive to say
>>
>>         for (i = 0; i < ATOM_INVALID; i++)
>>
>> than having to say UNKNOWN+1.
>
> Yeah, that's more intuitive. But in my opinion, using `ATOM_UNKNOWN +
> 1` instead of `0` at least shouldn't often result in more lines of
> code, and should be a bit easier to get right, compared to having to
> initialize the field with ATOM_UNKNOWN.

Number of lines is not all that important.

But the developers must remember that UNKNOWN is at the bottom end
and INVALID is at the top end, which is very taxing.  Tying UNKNOWN
to the top end and INVALID to the bottom end would equally be
plausible and there is no memory aid to help us remember which one
is which.  Compare it to "array indices begin at 0, and the upper
end is MAX".  Your scheme is much easier for developers to screw up.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-11  2:14     ` Junio C Hamano
  2021-05-11  5:51       ` Christian Couder
@ 2021-05-11 12:18       ` ZheNing Hu
  1 sibling, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-05-11 12:18 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Jeff King, Hariom Verma,
	Christian Couder

> And I'd assume I am right in the following.
>
> > + * ATOM_INVALID equals to the size of valid_atom array, which could help us
> > + * iterate over valid_atom array like this:
> > + *
> > + *   for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
>
> I find it far more intuitive to say
>
>         for (i = 0; i < ATOM_INVALID; i++)
>
> than having to say UNKNOWN+1.
>
> In any case, the values should be indented, and a comment should
> ensure that the final one stays at the end, perhaps like this.
>
>         enum atom_type {
>                 ATOM_INVALID = -2,
>                 ATOM_UNKNOWN = -1,
>                 ATOM_REFNAME,
>                 ...
>                 ATOM_ELSE,
>                 ATOM_MAX /* MUST BE AT THE END */
>         }
>
> (note that the trailing comma is deliberately omitted).
>
> It would allow people to say
>
>         for (i = 0; i < ATOM_MAX; i++)
>
> instead, which would be even nicer.

I think ATOM_INVALID and ATOM_MAX all will have a
similar effort. Why don't we omit one of them?

For the time being, all the used_atom entry create in
`parse_ref_filter_atom()`, we directly use
`used_atom[at].atom_type = i;` after realloc the used_atom.
There is no time for "ATOM_UNKNOWN" at all.

I don’t know if it makes a lot of sense use "ATOM_UNKNOWN"
at the moment.

--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-11  5:51       ` Christian Couder
  2021-05-11  6:12         ` Junio C Hamano
@ 2021-05-11 12:37         ` ZheNing Hu
  2021-05-11 13:05         ` Junio C Hamano
  2 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-05-11 12:37 UTC (permalink / raw)
  To: Christian Couder
  Cc: Junio C Hamano, ZheNing Hu via GitGitGadget, git, Jeff King,
	Hariom Verma

Christian Couder <christian.couder@gmail.com> 于2021年5月11日周二 下午1:51写道:
>
> >
> > Shouldn't it be (-1)?
>
> If it's -1 instead of 0, then it might be a bit more complex to
> initialize structs that contain such a field, as it cannot be done
> with only xcalloc().
>

I agree. If the traverse start from 0, an init atom_type will have
"ATOM_REFNAME" junk value. If let users manually adjust it to
ATOM_UNKNOWN, it seems to be very troublesome.

> > And I'd assume I am right in the following.
> >
> > > + * ATOM_INVALID equals to the size of valid_atom array, which could help us
> > > + * iterate over valid_atom array like this:
> > > + *
> > > + *   for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
> >
> > I find it far more intuitive to say
> >
> >         for (i = 0; i < ATOM_INVALID; i++)
> >
> > than having to say UNKNOWN+1.
>
> Yeah, that's more intuitive. But in my opinion, using `ATOM_UNKNOWN +
> 1` instead of `0` at least shouldn't often result in more lines of
> code, and should be a bit easier to get right, compared to having to
> initialize the field with ATOM_UNKNOWN.
>

There will be a trade-off. Traverse from 0 or does not need to adjust the
initialized atom_type = UNKNOWN.

>
> > It would allow people to say
> >
> >         for (i = 0; i < ATOM_MAX; i++)
> >
> > instead, which would be even nicer.
>
> Yeah, it's also a tradeoff to have the last one called ATOM_MAX
> instead of ATOM_INVALID, and to have a separate ATOM_INVALID if it's
> needed.

About ATOM_MAX or ATOM_INVALID, I have a idea:

enum atom_type {
ATOM_UNKNOWN,
...
ATOM_ELSE,
ATOM_INVALID,
+ATOM_MAX = ATOM_INVALID
 };

This might be able to do both.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-11  6:12         ` Junio C Hamano
@ 2021-05-11 12:53           ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-05-11 12:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Christian Couder, ZheNing Hu via GitGitGadget, git, Jeff King,
	Hariom Verma

Junio C Hamano <gitster@pobox.com> 于2021年5月11日周二 下午2:12写道:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> >> I find it far more intuitive to say
> >>
> >>         for (i = 0; i < ATOM_INVALID; i++)
> >>
> >> than having to say UNKNOWN+1.
> >
> > Yeah, that's more intuitive. But in my opinion, using `ATOM_UNKNOWN +
> > 1` instead of `0` at least shouldn't often result in more lines of
> > code, and should be a bit easier to get right, compared to having to
> > initialize the field with ATOM_UNKNOWN.
>
> Number of lines is not all that important.
>
> But the developers must remember that UNKNOWN is at the bottom end
> and INVALID is at the top end, which is very taxing.  Tying UNKNOWN
> to the top end and INVALID to the bottom end would equally be
> plausible and there is no memory aid to help us remember which one
> is which.  Compare it to "array indices begin at 0, and the upper
> end is MAX".  Your scheme is much easier for developers to screw up.
>

Yes, UNKNOWN + 1 is difficult to use. But using UNKNOWN = -1,
this means that the coder may indirectly use an init atom_type with
junk value "ATOM_REFNAME", they maybe did't notice they need
reinitialize the value to UNKNOWN.

I thought that perhaps such a naming would be better:

ATOM_BEGIN = ATOM_UNKNOWN + 1,
ATOM_END = ATOM_INVALID

       for (i = ATOM_BEGIN; i < ATOM_END; i++) {
       }

But ATOM_END has been used...

--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-11  5:51       ` Christian Couder
  2021-05-11  6:12         ` Junio C Hamano
  2021-05-11 12:37         ` ZheNing Hu
@ 2021-05-11 13:05         ` Junio C Hamano
  2 siblings, 0 replies; 27+ messages in thread
From: Junio C Hamano @ 2021-05-11 13:05 UTC (permalink / raw)
  To: Christian Couder
  Cc: ZheNing Hu via GitGitGadget, git, Jeff King, Hariom Verma, ZheNing Hu

Christian Couder <christian.couder@gmail.com> writes:

> If it's -1 instead of 0, then it might be a bit more complex to
> initialize structs that contain such a field, as it cannot be done
> with only xcalloc().

In general, yes, but I would have thought that the codepath that
allocates used_atom[] elements are pretty isolated---it is not like
there are random xcalloc() all over the code.  In fact, there is
only one "used_atom_cnt++" in the whole file, which means there is
only one place that (re)allocates the array.

	...
	at = used_atom_cnt;
	used_atom_cnt++;
	REALLOC_ARRAY(used_atom, used_atom_cnt);
	used_atom[at].name = xmemdupz(atom, ep - atom);
	used_atom[at].type = valid_atom[i].cmp_type;
	used_atom[at].source = valid_atom[i].source;
	...

So, I do not think there is even any need to worry about "initialize
to invalid and fill it in as we discover what it really is"; if
there were such a use pattern, UNKNOWN would be handy, but that is
not what we are dealing with here.  In the above snippet, we already
have found from which valid_atom[] element to instantiate the new
element in used_atom[] array.

>> > + * ATOM_INVALID equals to the size of valid_atom array, which could help us
>> > + * iterate over valid_atom array like this:
>> > + *
>> > + *   for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
>>
>> I find it far more intuitive to say
>>
>>         for (i = 0; i < ATOM_INVALID; i++)
>>
>> than having to say UNKNOWN+1.

And here I was being embarrassingly silly.

As long as we do not waste any entry in valid_atom[] with leading
gap, trailing gap or gap in the middle, the way to iterate over such
an array is

	for (i = 0; i < ARRAY_SIZE(valid_atom); i++)

hence, there is no need for ATOM_MAX, and no need to burden us to
remember that UNKNOWN is near the bottom of the range, and INVALID
is near the top of the range.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 0/2] [GSOC][RFC] ref-filter: introduce enum atom_type
  2021-05-10 15:03 ` [PATCH v2 0/2] " ZheNing Hu via GitGitGadget
  2021-05-10 15:03   ` [PATCH v2 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
  2021-05-10 15:03   ` [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
@ 2021-05-12 12:11   ` ZheNing Hu via GitGitGadget
  2021-05-12 12:11     ` [PATCH v3 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
                       ` (2 more replies)
  2 siblings, 3 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-12 12:11 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder, ZheNing Hu

Change from last version: Since used_atom obtains the index from valid_atom
as its atom_type after init in parse_ref_filter_atom(), here is the only
place where used_atom is initialized in the global, ATOM_INVALID and
ATOM_UNKNOWN seem to cause unnecessary trouble.

So remove ATOM_INVALID and ATOM_UNKNOWN, Use the original method of
traversing valid_atom.

ZheNing Hu (2):
  [GSOC] ref-filter: add objectsize to used_atom
  [GSOC] ref-filter: introduce enum atom_type

 ref-filter.c | 214 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 140 insertions(+), 74 deletions(-)


base-commit: 7e391989789db82983665667013a46eabc6fc570
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-951%2Fadlternative%2Fref-filter-atom-type-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-951/adlternative/ref-filter-atom-type-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/951

Range-diff vs v2:

 1:  91ca57c9d04a = 1:  91ca57c9d04a [GSOC] ref-filter: add objectsize to used_atom
 2:  a1f70b39b7ef ! 2:  43400cac58e7 [GSOC] ref-filter: introduce enum atom_type
     @@ Commit message
          `used_atom.atom_type` will record corresponding enum value
          from valid_atom entry index, and then in specific reference
          attribute filling step, only need to compare the value of
     -    the `used_atom.atom_type` to judge the atom type.
     -
     -    the enum value of `ATOM_UNKNOWN` is equals to zero, which
     -    could ensure that we can easily distinguish such a struct
     -    where the atom_type is known from such a struct where it
     -    is unknown yet.
     -
     -    the enum value of `ATOM_INVALID` is equals to the size of
     -    valid_atom array, which could help us iterate over
     -    valid_atom array using something like:
     -
     -    for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++)
     -            /* do something with valid_atom[i] */;
     +    the `used_atom[i].atom_type` to judge the atom type.
      
     +    Helped-by: Junio C Hamano <gitster@pobox.com>
     +    Helped-by: Christian Couder <christian.couder@gmail.com>
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
       ## ref-filter.c ##
     @@ ref-filter.c: static struct ref_to_worktree_map {
      + * In the atom parsing stage, it will be passed to used_atom.atom_type
      + * as the identifier of the atom type. We can judge the type of used_atom
      + * entry by `if (used_atom[i].atom_type == ATOM_*)`.
     -+ *
     -+ * ATOM_UNKNOWN equals to 0, used as an enumeration value of uninitialized
     -+ * atom_type.
     -+ * ATOM_INVALID equals to the size of valid_atom array, which could help us
     -+ * iterate over valid_atom array like this:
     -+ *
     -+ * 	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
     -+ *		int len = strlen(valid_atom[i].name);
     -+ *		if (len == atom_len && !memcmp(valid_atom[i].name, sp, len))
     -+ *			break;
     -+ *	}
      + */
      +enum atom_type {
     -+ATOM_UNKNOWN,
     -+ATOM_REFNAME,
     -+ATOM_OBJECTTYPE,
     -+ATOM_OBJECTSIZE,
     -+ATOM_OBJECTNAME,
     -+ATOM_DELTABASE,
     -+ATOM_TREE,
     -+ATOM_PARENT,
     -+ATOM_NUMPARENT,
     -+ATOM_OBJECT,
     -+ATOM_TYPE,
     -+ATOM_TAG,
     -+ATOM_AUTHOR,
     -+ATOM_AUTHORNAME,
     -+ATOM_AUTHOREMAIL,
     -+ATOM_AUTHORDATE,
     -+ATOM_COMMITTER,
     -+ATOM_COMMITTERNAME,
     -+ATOM_COMMITTEREMAIL,
     -+ATOM_COMMITTERDATE,
     -+ATOM_TAGGER,
     -+ATOM_TAGGERNAME,
     -+ATOM_TAGGEREMAIL,
     -+ATOM_TAGGERDATE,
     -+ATOM_CREATOR,
     -+ATOM_CREATORDATE,
     -+ATOM_SUBJECT,
     -+ATOM_BODY,
     -+ATOM_TRAILERS,
     -+ATOM_CONTENTS,
     -+ATOM_UPSTREAM,
     -+ATOM_PUSH,
     -+ATOM_SYMREF,
     -+ATOM_FLAG,
     -+ATOM_HEAD,
     -+ATOM_COLOR,
     -+ATOM_WORKTREEPATH,
     -+ATOM_ALIGN,
     -+ATOM_END,
     -+ATOM_IF,
     -+ATOM_THEN,
     -+ATOM_ELSE,
     -+ATOM_INVALID,
     ++	ATOM_REFNAME,
     ++	ATOM_OBJECTTYPE,
     ++	ATOM_OBJECTSIZE,
     ++	ATOM_OBJECTNAME,
     ++	ATOM_DELTABASE,
     ++	ATOM_TREE,
     ++	ATOM_PARENT,
     ++	ATOM_NUMPARENT,
     ++	ATOM_OBJECT,
     ++	ATOM_TYPE,
     ++	ATOM_TAG,
     ++	ATOM_AUTHOR,
     ++	ATOM_AUTHORNAME,
     ++	ATOM_AUTHOREMAIL,
     ++	ATOM_AUTHORDATE,
     ++	ATOM_COMMITTER,
     ++	ATOM_COMMITTERNAME,
     ++	ATOM_COMMITTEREMAIL,
     ++	ATOM_COMMITTERDATE,
     ++	ATOM_TAGGER,
     ++	ATOM_TAGGERNAME,
     ++	ATOM_TAGGEREMAIL,
     ++	ATOM_TAGGERDATE,
     ++	ATOM_CREATOR,
     ++	ATOM_CREATORDATE,
     ++	ATOM_SUBJECT,
     ++	ATOM_BODY,
     ++	ATOM_TRAILERS,
     ++	ATOM_CONTENTS,
     ++	ATOM_UPSTREAM,
     ++	ATOM_PUSH,
     ++	ATOM_SYMREF,
     ++	ATOM_FLAG,
     ++	ATOM_HEAD,
     ++	ATOM_COLOR,
     ++	ATOM_WORKTREEPATH,
     ++	ATOM_ALIGN,
     ++	ATOM_END,
     ++	ATOM_IF,
     ++	ATOM_THEN,
     ++	ATOM_ELSE,
      +};
      +
       /*
     @@ ref-filter.c: static struct {
       	/*
       	 * Please update $__git_ref_fieldlist in git-completion.bash
       	 * when you add new atoms
     -@@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
     - 	atom_len = (arg ? arg : ep) - sp;
     - 
     - 	/* Is the atom a valid one? */
     --	for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
     -+	for (i = ATOM_UNKNOWN + 1; i < ATOM_INVALID; i++) {
     - 		int len = strlen(valid_atom[i].name);
     - 		if (len == atom_len && !memcmp(valid_atom[i].name, sp, len))
     - 			break;
     - 	}
     - 
     --	if (ARRAY_SIZE(valid_atom) <= i)
     -+	if (i == ATOM_INVALID)
     - 		return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
     - 				       (int)(ep-atom), atom);
     - 	if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
      @@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
       	at = used_atom_cnt;
       	used_atom_cnt++;

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 1/2] [GSOC] ref-filter: add objectsize to used_atom
  2021-05-12 12:11   ` [PATCH v3 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
@ 2021-05-12 12:11     ` ZheNing Hu via GitGitGadget
  2021-05-12 23:11       ` Junio C Hamano
  2021-05-12 12:11     ` [PATCH v3 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
  2021-05-13 15:15     ` [PATCH v4 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-12 12:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder,
	ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

Since "objectsize:size" is composed of two parts,
"type:attribute". However, the original implementation
did not decouple the two parts "type" and "attribute" well,
we still need to judge separately whether the atom is
"objectsize" or "objectsize:disk" in `grab_common_values()`.

Add a new member `objectsize` to the union `used_atom.u`,
so that we can separate the judgment of atom type from the
judgment of atom attribute, This will bring scalability to
atom `%(objectsize)`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index a0adb4551d87..f420bae6e5ba 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -146,6 +146,9 @@ static struct used_atom {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
 		} oid;
+		struct {
+			enum { O_SIZE, O_SIZE_DISK } option;
+		} objectsize;
 		struct email_option {
 			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
 		} email_option;
@@ -269,11 +272,13 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
+		atom->u.objectsize.option = O_SIZE;
 		if (*atom->name == '*')
 			oi_deref.info.sizep = &oi_deref.size;
 		else
 			oi.info.sizep = &oi.size;
 	} else if (!strcmp(arg, "disk")) {
+		atom->u.objectsize.option = O_SIZE_DISK;
 		if (*atom->name == '*')
 			oi_deref.info.disk_sizep = &oi_deref.disk_size;
 		else
@@ -967,12 +972,14 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 			name++;
 		if (!strcmp(name, "objecttype"))
 			v->s = xstrdup(type_name(oi->type));
-		else if (!strcmp(name, "objectsize:disk")) {
-			v->value = oi->disk_size;
-			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
-		} else if (!strcmp(name, "objectsize")) {
-			v->value = oi->size;
-			v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
+		else if (starts_with(name, "objectsize")) {
+			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
+				v->value = oi->disk_size;
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
+			} else if (used_atom[i].u.objectsize.option == O_SIZE) {
+				v->value = oi->size;
+				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
+			}
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-12 12:11   ` [PATCH v3 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
  2021-05-12 12:11     ` [PATCH v3 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
@ 2021-05-12 12:11     ` ZheNing Hu via GitGitGadget
  2021-05-12 23:21       ` Junio C Hamano
  2021-05-13 15:15     ` [PATCH v4 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
  2 siblings, 1 reply; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-12 12:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder,
	ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

In the original ref-filter design, it will copy the parsed
atom's name and attributes to `used_atom[i].name` in the
atom's parsing step, and use it again for string matching
in the later specific ref attributes filling step. It use
a lot of string matching to determine which atom we need.

Introduce the enum "atom_type", each enum value is named
as `ATOM_*`, which is the index of each corresponding
valid_atom entry. In the first step of the atom parsing,
`used_atom.atom_type` will record corresponding enum value
from valid_atom entry index, and then in specific reference
attribute filling step, only need to compare the value of
the `used_atom[i].atom_type` to judge the atom type.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 197 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 128 insertions(+), 69 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f420bae6e5ba..cf36716b76e3 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -108,6 +108,56 @@ static struct ref_to_worktree_map {
 	struct worktree **worktrees;
 } ref_to_worktree_map;
 
+/*
+ * The enum atom_type is used as the coordinates of valid_atom entry.
+ * In the atom parsing stage, it will be passed to used_atom.atom_type
+ * as the identifier of the atom type. We can judge the type of used_atom
+ * entry by `if (used_atom[i].atom_type == ATOM_*)`.
+ */
+enum atom_type {
+	ATOM_REFNAME,
+	ATOM_OBJECTTYPE,
+	ATOM_OBJECTSIZE,
+	ATOM_OBJECTNAME,
+	ATOM_DELTABASE,
+	ATOM_TREE,
+	ATOM_PARENT,
+	ATOM_NUMPARENT,
+	ATOM_OBJECT,
+	ATOM_TYPE,
+	ATOM_TAG,
+	ATOM_AUTHOR,
+	ATOM_AUTHORNAME,
+	ATOM_AUTHOREMAIL,
+	ATOM_AUTHORDATE,
+	ATOM_COMMITTER,
+	ATOM_COMMITTERNAME,
+	ATOM_COMMITTEREMAIL,
+	ATOM_COMMITTERDATE,
+	ATOM_TAGGER,
+	ATOM_TAGGERNAME,
+	ATOM_TAGGEREMAIL,
+	ATOM_TAGGERDATE,
+	ATOM_CREATOR,
+	ATOM_CREATORDATE,
+	ATOM_SUBJECT,
+	ATOM_BODY,
+	ATOM_TRAILERS,
+	ATOM_CONTENTS,
+	ATOM_UPSTREAM,
+	ATOM_PUSH,
+	ATOM_SYMREF,
+	ATOM_FLAG,
+	ATOM_HEAD,
+	ATOM_COLOR,
+	ATOM_WORKTREEPATH,
+	ATOM_ALIGN,
+	ATOM_END,
+	ATOM_IF,
+	ATOM_THEN,
+	ATOM_ELSE,
+};
+
 /*
  * An atom is a valid field atom listed below, possibly prefixed with
  * a "*" to denote deref_tag().
@@ -119,6 +169,7 @@ static struct ref_to_worktree_map {
  * array.
  */
 static struct used_atom {
+	enum atom_type atom_type;
 	const char *name;
 	cmp_type type;
 	info_source source;
@@ -506,47 +557,47 @@ static struct {
 	int (*parser)(const struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
-	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
-	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
-	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
-	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
-	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
-	{ "object", SOURCE_OBJ },
-	{ "type", SOURCE_OBJ },
-	{ "tag", SOURCE_OBJ },
-	{ "author", SOURCE_OBJ },
-	{ "authorname", SOURCE_OBJ },
-	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "authordate", SOURCE_OBJ, FIELD_TIME },
-	{ "committer", SOURCE_OBJ },
-	{ "committername", SOURCE_OBJ },
-	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "tagger", SOURCE_OBJ },
-	{ "taggername", SOURCE_OBJ },
-	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "creator", SOURCE_OBJ },
-	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
-	{ "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
-	{ "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
-	{ "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
-	{ "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
-	{ "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "flag", SOURCE_NONE },
-	{ "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
-	{ "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
-	{ "worktreepath", SOURCE_NONE },
-	{ "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
-	{ "end", SOURCE_NONE },
-	{ "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
-	{ "then", SOURCE_NONE },
-	{ "else", SOURCE_NONE },
+	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	[ATOM_OBJECTTYPE] = { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
+	[ATOM_OBJECTSIZE] = { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
+	[ATOM_OBJECTNAME] = { "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
+	[ATOM_DELTABASE] = { "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
+	[ATOM_TREE] = { "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	[ATOM_PARENT] = { "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	[ATOM_NUMPARENT] = { "numparent", SOURCE_OBJ, FIELD_ULONG },
+	[ATOM_OBJECT] = { "object", SOURCE_OBJ },
+	[ATOM_TYPE] = { "type", SOURCE_OBJ },
+	[ATOM_TAG] = { "tag", SOURCE_OBJ },
+	[ATOM_AUTHOR] = { "author", SOURCE_OBJ },
+	[ATOM_AUTHORNAME] = { "authorname", SOURCE_OBJ },
+	[ATOM_AUTHOREMAIL] = { "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_AUTHORDATE] = { "authordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_COMMITTER] = { "committer", SOURCE_OBJ },
+	[ATOM_COMMITTERNAME] = { "committername", SOURCE_OBJ },
+	[ATOM_COMMITTEREMAIL] = { "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_COMMITTERDATE] = { "committerdate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_TAGGER] = { "tagger", SOURCE_OBJ },
+	[ATOM_TAGGERNAME] = { "taggername", SOURCE_OBJ },
+	[ATOM_TAGGEREMAIL] = { "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_TAGGERDATE] = { "taggerdate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_CREATOR] = { "creator", SOURCE_OBJ },
+	[ATOM_CREATORDATE] = { "creatordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_SUBJECT] = { "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
+	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
+	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
+	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	[ATOM_FLAG] = { "flag", SOURCE_NONE },
+	[ATOM_HEAD] = { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
+	[ATOM_COLOR] = { "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
+	[ATOM_WORKTREEPATH] = { "worktreepath", SOURCE_NONE },
+	[ATOM_ALIGN] = { "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
+	[ATOM_END] = { "end", SOURCE_NONE },
+	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
+	[ATOM_THEN] = { "then", SOURCE_NONE },
+	[ATOM_ELSE] = { "else", SOURCE_NONE },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -628,6 +679,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	at = used_atom_cnt;
 	used_atom_cnt++;
 	REALLOC_ARRAY(used_atom, used_atom_cnt);
+	used_atom[at].atom_type = i;
 	used_atom[at].name = xmemdupz(atom, ep - atom);
 	used_atom[at].type = valid_atom[i].cmp_type;
 	used_atom[at].source = valid_atom[i].source;
@@ -652,7 +704,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return -1;
 	if (*atom == '*')
 		need_tagged = 1;
-	if (!strcmp(valid_atom[i].name, "symref"))
+	if (i == ATOM_SYMREF)
 		need_symref = 1;
 	return at;
 }
@@ -965,14 +1017,15 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "objecttype"))
+		if (atom_type == ATOM_OBJECTTYPE)
 			v->s = xstrdup(type_name(oi->type));
-		else if (starts_with(name, "objectsize")) {
+		else if (atom_type == ATOM_OBJECTSIZE) {
 			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
 				v->value = oi->disk_size;
 				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
@@ -980,9 +1033,9 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 				v->value = oi->size;
 				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
 			}
-		} else if (!strcmp(name, "deltabase"))
+		} else if (atom_type == ATOM_DELTABASE)
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
-		else if (deref)
+		else if (atom_type == ATOM_OBJECTNAME && deref)
 			grab_oid(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
@@ -995,16 +1048,17 @@ static void grab_tag_values(struct atom_value *val, int deref, struct object *ob
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tag"))
+		if (atom_type == ATOM_TAG)
 			v->s = xstrdup(tag->tag);
-		else if (!strcmp(name, "type") && tag->tagged)
+		else if (atom_type == ATOM_TYPE && tag->tagged)
 			v->s = xstrdup(type_name(tag->tagged->type));
-		else if (!strcmp(name, "object") && tag->tagged)
+		else if (atom_type == ATOM_OBJECT && tag->tagged)
 			v->s = xstrdup(oid_to_hex(&tag->tagged->oid));
 	}
 }
@@ -1017,18 +1071,20 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
+		if (atom_type == ATOM_TREE &&
+		    grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
 			continue;
-		if (!strcmp(name, "numparent")) {
+		if (atom_type == ATOM_NUMPARENT) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (starts_with(name, "parent")) {
+		else if (atom_type == ATOM_PARENT) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
@@ -1208,15 +1264,16 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 		return;
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
-		if (starts_with(name, "creatordate"))
+		if (atom_type == ATOM_CREATORDATE)
 			grab_date(wholine, v, name);
-		else if (!strcmp(name, "creator"))
+		else if (atom_type == ATOM_CREATOR)
 			v->s = copy_line(wholine);
 	}
 }
@@ -1696,6 +1753,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	/* Fill in specials first */
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
+		enum atom_type atom_type = atom->atom_type;
 		const char *name = used_atom[i].name;
 		struct atom_value *v = &ref->value[i];
 		int deref = 0;
@@ -1710,18 +1768,18 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			name++;
 		}
 
-		if (starts_with(name, "refname"))
+		if (atom_type == ATOM_REFNAME)
 			refname = get_refname(atom, ref);
-		else if (!strcmp(name, "worktreepath")) {
+		else if (atom_type == ATOM_WORKTREEPATH) {
 			if (ref->kind == FILTER_REFS_BRANCHES)
 				v->s = get_worktree_path(atom, ref);
 			else
 				v->s = xstrdup("");
 			continue;
 		}
-		else if (starts_with(name, "symref"))
+		else if (atom_type == ATOM_SYMREF)
 			refname = get_symref(atom, ref);
-		else if (starts_with(name, "upstream")) {
+		else if (atom_type == ATOM_UPSTREAM) {
 			const char *branch_name;
 			/* only local branches may have an upstream */
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1737,7 +1795,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			else
 				v->s = xstrdup("");
 			continue;
-		} else if (atom->u.remote_ref.push) {
+		} else if (atom_type == ATOM_PUSH && atom->u.remote_ref.push) {
 			const char *branch_name;
 			v->s = xstrdup("");
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1756,10 +1814,10 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			free((char *)v->s);
 			fill_remote_ref_details(atom, refname, branch, &v->s);
 			continue;
-		} else if (starts_with(name, "color:")) {
+		} else if (atom_type == ATOM_COLOR) {
 			v->s = xstrdup(atom->u.color);
 			continue;
-		} else if (!strcmp(name, "flag")) {
+		} else if (atom_type == ATOM_FLAG) {
 			char buf[256], *cp = buf;
 			if (ref->flag & REF_ISSYMREF)
 				cp = copy_advance(cp, ",symref");
@@ -1772,23 +1830,24 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_oid(name, "objectname", &ref->objectname, v, atom)) {
-			continue;
-		} else if (!strcmp(name, "HEAD")) {
+		} else if (!deref && atom_type == ATOM_OBJECTNAME &&
+			   grab_oid(name, "objectname", &ref->objectname, v, atom)) {
+				continue;
+		} else if (atom_type == ATOM_HEAD) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
 				v->s = xstrdup("*");
 			else
 				v->s = xstrdup(" ");
 			continue;
-		} else if (starts_with(name, "align")) {
+		} else if (atom_type == ATOM_ALIGN) {
 			v->handler = align_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "end")) {
+		} else if (atom_type == ATOM_END) {
 			v->handler = end_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (starts_with(name, "if")) {
+		} else if (atom_type == ATOM_IF) {
 			const char *s;
 			if (skip_prefix(name, "if:", &s))
 				v->s = xstrdup(s);
@@ -1796,11 +1855,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup("");
 			v->handler = if_atom_handler;
 			continue;
-		} else if (!strcmp(name, "then")) {
+		} else if (atom_type == ATOM_THEN) {
 			v->handler = then_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "else")) {
+		} else if (atom_type == ATOM_ELSE) {
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] [GSOC] ref-filter: add objectsize to used_atom
  2021-05-12 12:11     ` [PATCH v3 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
@ 2021-05-12 23:11       ` Junio C Hamano
  2021-05-13  9:04         ` ZheNing Hu
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-05-12 23:11 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Jeff King, Hariom Verma, Christian Couder, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> Since "objectsize:size" is composed of two parts,
> "type:attribute". However, the original implementation

Y is missing in the above "Since X, Y".  Because it is composed of
two parts, what?

Dropping "Since " would make the sentence work better.  Also, I
think it should be "objectsize:disk".

> did not decouple the two parts "type" and "attribute" well,
> we still need to judge separately whether the atom is
> "objectsize" or "objectsize:disk" in `grab_common_values()`.

Perhaps

	When the support for "objectsize:disk" was bolted onto the
	existing support for "objectsize", it didn't follow the
	usual pattern for handling "atomtype:modifier", which reads
	the <modifier> part just once while parsing the format
	string, and store the parsed result in the union in the
	used_atom structure, so that the string form of it does not
	have to be parsed over and over at runtime (e.g. in
	grab_common_values()).

> Add a new member `objectsize` to the union `used_atom.u`,
> so that we can separate the judgment of atom type from the
> judgment of atom attribute, This will bring scalability to
> atom `%(objectsize)`.

OK.

> +		struct {
> +			enum { O_SIZE, O_SIZE_DISK } option;
> +		} objectsize;

OK.

> @@ -967,12 +972,14 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
>  			name++;
>  		if (!strcmp(name, "objecttype"))
>  			v->s = xstrdup(type_name(oi->type));
> +		else if (starts_with(name, "objectsize")) {
> +			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
> +				v->value = oi->disk_size;
> +				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
> +			} else if (used_atom[i].u.objectsize.option == O_SIZE) {
> +				v->value = oi->size;
> +				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
> +			}

OK.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-12 12:11     ` [PATCH v3 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
@ 2021-05-12 23:21       ` Junio C Hamano
  2021-05-13  9:25         ` ZheNing Hu
  0 siblings, 1 reply; 27+ messages in thread
From: Junio C Hamano @ 2021-05-12 23:21 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget
  Cc: git, Jeff King, Hariom Verma, Christian Couder, ZheNing Hu

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +/*
> + * The enum atom_type is used as the coordinates of valid_atom entry.

Usually we do not say "coordinates" when we talk about X of an array
element A[X].  "... used as an index in the valid_atom[] array." perhaps.

> + * In the atom parsing stage, it will be passed to used_atom.atom_type
> + * as the identifier of the atom type. We can judge the type of used_atom

You seem to like the verb "judge" (it was also seen in the proposed
log message for 1/2) and tend to overuse it when we use other verbs;
in this particular case, probably the right verb is to "check".

> + * entry by `if (used_atom[i].atom_type == ATOM_*)`.
> + */
> +enum atom_type {
> +	ATOM_REFNAME,
> +...
> +	ATOM_ELSE,
> +};
> +
> @@ -119,6 +169,7 @@ static struct ref_to_worktree_map {
>   * array.
>   */
>  static struct used_atom {
> +	enum atom_type atom_type;
>  	const char *name;
>  	cmp_type type;
>  	info_source source;

OK.

> @@ -506,47 +557,47 @@ static struct {
>  	int (*parser)(const struct ref_format *format, struct used_atom *atom,
>  		      const char *arg, struct strbuf *err);
>  } valid_atom[] = {
> -	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
> ...
> -	{ "else", SOURCE_NONE },
> +	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
> ...
> +	[ATOM_ELSE] = { "else", SOURCE_NONE },

Makes sense.

> @@ -628,6 +679,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
>  	at = used_atom_cnt;
>  	used_atom_cnt++;
>  	REALLOC_ARRAY(used_atom, used_atom_cnt);
> +	used_atom[at].atom_type = i;

Makes sense.

>  	used_atom[at].name = xmemdupz(atom, ep - atom);
>  	used_atom[at].type = valid_atom[i].cmp_type;
>  	used_atom[at].source = valid_atom[i].source;
> @@ -652,7 +704,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
>  		return -1;
>  	if (*atom == '*')
>  		need_tagged = 1;
> -	if (!strcmp(valid_atom[i].name, "symref"))
> +	if (i == ATOM_SYMREF)
>  		need_symref = 1;
>  	return at;
>  }

Nice.

> @@ -965,14 +1017,15 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
>  
>  	for (i = 0; i < used_atom_cnt; i++) {
>  		const char *name = used_atom[i].name;
> +		enum atom_type atom_type = used_atom[i].atom_type;
>  		struct atom_value *v = &val[i];
>  		if (!!deref != (*name == '*'))
>  			continue;
>  		if (deref)
>  			name++;
> -		if (!strcmp(name, "objecttype"))
> +		if (atom_type == ATOM_OBJECTTYPE)
>  			v->s = xstrdup(type_name(oi->type));
> -		else if (starts_with(name, "objectsize")) {
> +		else if (atom_type == ATOM_OBJECTSIZE) {
>  			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
>  				v->value = oi->disk_size;
>  				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);

Replacing !strcmp() with comparison with ATOM_* like the above is
the best solution for this step, but I wonder if this part (or any
other part that this patch touches) would benefit from using a
switch statement on atom_type.  Something to think about in the
future, after the dust settles.

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] [GSOC] ref-filter: add objectsize to used_atom
  2021-05-12 23:11       ` Junio C Hamano
@ 2021-05-13  9:04         ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-05-13  9:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Jeff King, Hariom Verma,
	Christian Couder

Junio C Hamano <gitster@pobox.com> 于2021年5月13日周四 上午7:11写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Since "objectsize:size" is composed of two parts,
> > "type:attribute". However, the original implementation
>
> Y is missing in the above "Since X, Y".  Because it is composed of
> two parts, what?
>
> Dropping "Since " would make the sentence work better.  Also, I
> think it should be "objectsize:disk".
>

Oh, here is my typo.

> > did not decouple the two parts "type" and "attribute" well,
> > we still need to judge separately whether the atom is
> > "objectsize" or "objectsize:disk" in `grab_common_values()`.
>
> Perhaps
>
>         When the support for "objectsize:disk" was bolted onto the
>         existing support for "objectsize", it didn't follow the
>         usual pattern for handling "atomtype:modifier", which reads
>         the <modifier> part just once while parsing the format
>         string, and store the parsed result in the union in the
>         used_atom structure, so that the string form of it does not
>         have to be parsed over and over at runtime (e.g. in
>         grab_common_values()).
>

That’s great.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-12 23:21       ` Junio C Hamano
@ 2021-05-13  9:25         ` ZheNing Hu
  0 siblings, 0 replies; 27+ messages in thread
From: ZheNing Hu @ 2021-05-13  9:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: ZheNing Hu via GitGitGadget, Git List, Jeff King, Hariom Verma,
	Christian Couder

Junio C Hamano <gitster@pobox.com> 于2021年5月13日周四 上午7:21写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +/*
> > + * The enum atom_type is used as the coordinates of valid_atom entry.
>
> Usually we do not say "coordinates" when we talk about X of an array
> element A[X].  "... used as an index in the valid_atom[] array." perhaps.
>

Ok, use index instead of coordinates.

> > + * In the atom parsing stage, it will be passed to used_atom.atom_type
> > + * as the identifier of the atom type. We can judge the type of used_atom
>
> You seem to like the verb "judge" (it was also seen in the proposed
> log message for 1/2) and tend to overuse it when we use other verbs;
> in this particular case, probably the right verb is to "check".
>

Yes, "judge" closer to my personal language expression, I will use "check".

> > @@ -965,14 +1017,15 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
> >
> >       for (i = 0; i < used_atom_cnt; i++) {
> >               const char *name = used_atom[i].name;
> > +             enum atom_type atom_type = used_atom[i].atom_type;
> >               struct atom_value *v = &val[i];
> >               if (!!deref != (*name == '*'))
> >                       continue;
> >               if (deref)
> >                       name++;
> > -             if (!strcmp(name, "objecttype"))
> > +             if (atom_type == ATOM_OBJECTTYPE)
> >                       v->s = xstrdup(type_name(oi->type));
> > -             else if (starts_with(name, "objectsize")) {
> > +             else if (atom_type == ATOM_OBJECTSIZE) {
> >                       if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
> >                               v->value = oi->disk_size;
> >                               v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
>
> Replacing !strcmp() with comparison with ATOM_* like the above is
> the best solution for this step, but I wonder if this part (or any
> other part that this patch touches) would benefit from using a
> switch statement on atom_type.  Something to think about in the
> future, after the dust settles.
>
> Thanks.

Well, that's right, use switch can increase readability and maintainability.
This can be used as a future direction.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4 0/2] [GSOC][RFC] ref-filter: introduce enum atom_type
  2021-05-12 12:11   ` [PATCH v3 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
  2021-05-12 12:11     ` [PATCH v3 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
  2021-05-12 12:11     ` [PATCH v3 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
@ 2021-05-13 15:15     ` ZheNing Hu via GitGitGadget
  2021-05-13 15:15       ` [PATCH v4 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
  2021-05-13 15:15       ` [PATCH v4 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
  2 siblings, 2 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-13 15:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder, ZheNing Hu

Change from last version: fix some typo error and change commit message with
Junio's help.

ZheNing Hu (2):
  [GSOC] ref-filter: add objectsize to used_atom
  [GSOC] ref-filter: introduce enum atom_type

 ref-filter.c | 214 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 140 insertions(+), 74 deletions(-)


base-commit: 7e391989789db82983665667013a46eabc6fc570
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-951%2Fadlternative%2Fref-filter-atom-type-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-951/adlternative/ref-filter-atom-type-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/951

Range-diff vs v3:

 1:  91ca57c9d04a ! 1:  23fc04f7eb23 [GSOC] ref-filter: add objectsize to used_atom
     @@ Metadata
       ## Commit message ##
          [GSOC] ref-filter: add objectsize to used_atom
      
     -    Since "objectsize:size" is composed of two parts,
     -    "type:attribute". However, the original implementation
     -    did not decouple the two parts "type" and "attribute" well,
     -    we still need to judge separately whether the atom is
     -    "objectsize" or "objectsize:disk" in `grab_common_values()`.
     +    When the support for "objectsize:disk" was bolted onto the
     +    existing support for "objectsize", it didn't follow the
     +    usual pattern for handling "atomtype:modifier", which reads
     +    the <modifier> part just once while parsing the format
     +    string, and store the parsed result in the union in the
     +    used_atom structure, so that the string form of it does not
     +    have to be parsed over and over at runtime (e.g. in
     +    grab_common_values()).
      
          Add a new member `objectsize` to the union `used_atom.u`,
     -    so that we can separate the judgment of atom type from the
     -    judgment of atom attribute, This will bring scalability to
     -    atom `%(objectsize)`.
     +    so that we can separate the check of <modifier> from the
     +    check of <atomtype>, this will bring scalability to atom
     +    `%(objectsize)`.
      
          Signed-off-by: ZheNing Hu <adlternative@gmail.com>
      
 2:  43400cac58e7 ! 2:  50cfe1f0c6c1 [GSOC] ref-filter: introduce enum atom_type
     @@ Commit message
          `used_atom.atom_type` will record corresponding enum value
          from valid_atom entry index, and then in specific reference
          attribute filling step, only need to compare the value of
     -    the `used_atom[i].atom_type` to judge the atom type.
     +    the `used_atom[i].atom_type` to check the atom type.
      
          Helped-by: Junio C Hamano <gitster@pobox.com>
          Helped-by: Christian Couder <christian.couder@gmail.com>
     @@ ref-filter.c: static struct ref_to_worktree_map {
       } ref_to_worktree_map;
       
      +/*
     -+ * The enum atom_type is used as the coordinates of valid_atom entry.
     ++ * The enum atom_type is used as the index of valid_atom array.
      + * In the atom parsing stage, it will be passed to used_atom.atom_type
     -+ * as the identifier of the atom type. We can judge the type of used_atom
     ++ * as the identifier of the atom type. We can check the type of used_atom
      + * entry by `if (used_atom[i].atom_type == ATOM_*)`.
      + */
      +enum atom_type {

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4 1/2] [GSOC] ref-filter: add objectsize to used_atom
  2021-05-13 15:15     ` [PATCH v4 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
@ 2021-05-13 15:15       ` ZheNing Hu via GitGitGadget
  2021-05-13 15:15       ` [PATCH v4 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
  1 sibling, 0 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-13 15:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder,
	ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

When the support for "objectsize:disk" was bolted onto the
existing support for "objectsize", it didn't follow the
usual pattern for handling "atomtype:modifier", which reads
the <modifier> part just once while parsing the format
string, and store the parsed result in the union in the
used_atom structure, so that the string form of it does not
have to be parsed over and over at runtime (e.g. in
grab_common_values()).

Add a new member `objectsize` to the union `used_atom.u`,
so that we can separate the check of <modifier> from the
check of <atomtype>, this will bring scalability to atom
`%(objectsize)`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index a0adb4551d87..f420bae6e5ba 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -146,6 +146,9 @@ static struct used_atom {
 			enum { O_FULL, O_LENGTH, O_SHORT } option;
 			unsigned int length;
 		} oid;
+		struct {
+			enum { O_SIZE, O_SIZE_DISK } option;
+		} objectsize;
 		struct email_option {
 			enum { EO_RAW, EO_TRIM, EO_LOCALPART } option;
 		} email_option;
@@ -269,11 +272,13 @@ static int objectsize_atom_parser(const struct ref_format *format, struct used_a
 				  const char *arg, struct strbuf *err)
 {
 	if (!arg) {
+		atom->u.objectsize.option = O_SIZE;
 		if (*atom->name == '*')
 			oi_deref.info.sizep = &oi_deref.size;
 		else
 			oi.info.sizep = &oi.size;
 	} else if (!strcmp(arg, "disk")) {
+		atom->u.objectsize.option = O_SIZE_DISK;
 		if (*atom->name == '*')
 			oi_deref.info.disk_sizep = &oi_deref.disk_size;
 		else
@@ -967,12 +972,14 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 			name++;
 		if (!strcmp(name, "objecttype"))
 			v->s = xstrdup(type_name(oi->type));
-		else if (!strcmp(name, "objectsize:disk")) {
-			v->value = oi->disk_size;
-			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
-		} else if (!strcmp(name, "objectsize")) {
-			v->value = oi->size;
-			v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
+		else if (starts_with(name, "objectsize")) {
+			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
+				v->value = oi->disk_size;
+				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
+			} else if (used_atom[i].u.objectsize.option == O_SIZE) {
+				v->value = oi->size;
+				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
+			}
 		} else if (!strcmp(name, "deltabase"))
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
 		else if (deref)
-- 
gitgitgadget


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4 2/2] [GSOC] ref-filter: introduce enum atom_type
  2021-05-13 15:15     ` [PATCH v4 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
  2021-05-13 15:15       ` [PATCH v4 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
@ 2021-05-13 15:15       ` ZheNing Hu via GitGitGadget
  1 sibling, 0 replies; 27+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-05-13 15:15 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Hariom Verma, Christian Couder,
	ZheNing Hu, ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

In the original ref-filter design, it will copy the parsed
atom's name and attributes to `used_atom[i].name` in the
atom's parsing step, and use it again for string matching
in the later specific ref attributes filling step. It use
a lot of string matching to determine which atom we need.

Introduce the enum "atom_type", each enum value is named
as `ATOM_*`, which is the index of each corresponding
valid_atom entry. In the first step of the atom parsing,
`used_atom.atom_type` will record corresponding enum value
from valid_atom entry index, and then in specific reference
attribute filling step, only need to compare the value of
the `used_atom[i].atom_type` to check the atom type.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Christian Couder <christian.couder@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 ref-filter.c | 197 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 128 insertions(+), 69 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index f420bae6e5ba..e7a48d428547 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -108,6 +108,56 @@ static struct ref_to_worktree_map {
 	struct worktree **worktrees;
 } ref_to_worktree_map;
 
+/*
+ * The enum atom_type is used as the index of valid_atom array.
+ * In the atom parsing stage, it will be passed to used_atom.atom_type
+ * as the identifier of the atom type. We can check the type of used_atom
+ * entry by `if (used_atom[i].atom_type == ATOM_*)`.
+ */
+enum atom_type {
+	ATOM_REFNAME,
+	ATOM_OBJECTTYPE,
+	ATOM_OBJECTSIZE,
+	ATOM_OBJECTNAME,
+	ATOM_DELTABASE,
+	ATOM_TREE,
+	ATOM_PARENT,
+	ATOM_NUMPARENT,
+	ATOM_OBJECT,
+	ATOM_TYPE,
+	ATOM_TAG,
+	ATOM_AUTHOR,
+	ATOM_AUTHORNAME,
+	ATOM_AUTHOREMAIL,
+	ATOM_AUTHORDATE,
+	ATOM_COMMITTER,
+	ATOM_COMMITTERNAME,
+	ATOM_COMMITTEREMAIL,
+	ATOM_COMMITTERDATE,
+	ATOM_TAGGER,
+	ATOM_TAGGERNAME,
+	ATOM_TAGGEREMAIL,
+	ATOM_TAGGERDATE,
+	ATOM_CREATOR,
+	ATOM_CREATORDATE,
+	ATOM_SUBJECT,
+	ATOM_BODY,
+	ATOM_TRAILERS,
+	ATOM_CONTENTS,
+	ATOM_UPSTREAM,
+	ATOM_PUSH,
+	ATOM_SYMREF,
+	ATOM_FLAG,
+	ATOM_HEAD,
+	ATOM_COLOR,
+	ATOM_WORKTREEPATH,
+	ATOM_ALIGN,
+	ATOM_END,
+	ATOM_IF,
+	ATOM_THEN,
+	ATOM_ELSE,
+};
+
 /*
  * An atom is a valid field atom listed below, possibly prefixed with
  * a "*" to denote deref_tag().
@@ -119,6 +169,7 @@ static struct ref_to_worktree_map {
  * array.
  */
 static struct used_atom {
+	enum atom_type atom_type;
 	const char *name;
 	cmp_type type;
 	info_source source;
@@ -506,47 +557,47 @@ static struct {
 	int (*parser)(const struct ref_format *format, struct used_atom *atom,
 		      const char *arg, struct strbuf *err);
 } valid_atom[] = {
-	{ "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
-	{ "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
-	{ "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
-	{ "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
-	{ "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
-	{ "numparent", SOURCE_OBJ, FIELD_ULONG },
-	{ "object", SOURCE_OBJ },
-	{ "type", SOURCE_OBJ },
-	{ "tag", SOURCE_OBJ },
-	{ "author", SOURCE_OBJ },
-	{ "authorname", SOURCE_OBJ },
-	{ "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "authordate", SOURCE_OBJ, FIELD_TIME },
-	{ "committer", SOURCE_OBJ },
-	{ "committername", SOURCE_OBJ },
-	{ "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "committerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "tagger", SOURCE_OBJ },
-	{ "taggername", SOURCE_OBJ },
-	{ "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
-	{ "taggerdate", SOURCE_OBJ, FIELD_TIME },
-	{ "creator", SOURCE_OBJ },
-	{ "creatordate", SOURCE_OBJ, FIELD_TIME },
-	{ "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
-	{ "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
-	{ "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
-	{ "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
-	{ "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
-	{ "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
-	{ "flag", SOURCE_NONE },
-	{ "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
-	{ "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
-	{ "worktreepath", SOURCE_NONE },
-	{ "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
-	{ "end", SOURCE_NONE },
-	{ "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
-	{ "then", SOURCE_NONE },
-	{ "else", SOURCE_NONE },
+	[ATOM_REFNAME] = { "refname", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	[ATOM_OBJECTTYPE] = { "objecttype", SOURCE_OTHER, FIELD_STR, objecttype_atom_parser },
+	[ATOM_OBJECTSIZE] = { "objectsize", SOURCE_OTHER, FIELD_ULONG, objectsize_atom_parser },
+	[ATOM_OBJECTNAME] = { "objectname", SOURCE_OTHER, FIELD_STR, oid_atom_parser },
+	[ATOM_DELTABASE] = { "deltabase", SOURCE_OTHER, FIELD_STR, deltabase_atom_parser },
+	[ATOM_TREE] = { "tree", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	[ATOM_PARENT] = { "parent", SOURCE_OBJ, FIELD_STR, oid_atom_parser },
+	[ATOM_NUMPARENT] = { "numparent", SOURCE_OBJ, FIELD_ULONG },
+	[ATOM_OBJECT] = { "object", SOURCE_OBJ },
+	[ATOM_TYPE] = { "type", SOURCE_OBJ },
+	[ATOM_TAG] = { "tag", SOURCE_OBJ },
+	[ATOM_AUTHOR] = { "author", SOURCE_OBJ },
+	[ATOM_AUTHORNAME] = { "authorname", SOURCE_OBJ },
+	[ATOM_AUTHOREMAIL] = { "authoremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_AUTHORDATE] = { "authordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_COMMITTER] = { "committer", SOURCE_OBJ },
+	[ATOM_COMMITTERNAME] = { "committername", SOURCE_OBJ },
+	[ATOM_COMMITTEREMAIL] = { "committeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_COMMITTERDATE] = { "committerdate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_TAGGER] = { "tagger", SOURCE_OBJ },
+	[ATOM_TAGGERNAME] = { "taggername", SOURCE_OBJ },
+	[ATOM_TAGGEREMAIL] = { "taggeremail", SOURCE_OBJ, FIELD_STR, person_email_atom_parser },
+	[ATOM_TAGGERDATE] = { "taggerdate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_CREATOR] = { "creator", SOURCE_OBJ },
+	[ATOM_CREATORDATE] = { "creatordate", SOURCE_OBJ, FIELD_TIME },
+	[ATOM_SUBJECT] = { "subject", SOURCE_OBJ, FIELD_STR, subject_atom_parser },
+	[ATOM_BODY] = { "body", SOURCE_OBJ, FIELD_STR, body_atom_parser },
+	[ATOM_TRAILERS] = { "trailers", SOURCE_OBJ, FIELD_STR, trailers_atom_parser },
+	[ATOM_CONTENTS] = { "contents", SOURCE_OBJ, FIELD_STR, contents_atom_parser },
+	[ATOM_UPSTREAM] = { "upstream", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	[ATOM_PUSH] = { "push", SOURCE_NONE, FIELD_STR, remote_ref_atom_parser },
+	[ATOM_SYMREF] = { "symref", SOURCE_NONE, FIELD_STR, refname_atom_parser },
+	[ATOM_FLAG] = { "flag", SOURCE_NONE },
+	[ATOM_HEAD] = { "HEAD", SOURCE_NONE, FIELD_STR, head_atom_parser },
+	[ATOM_COLOR] = { "color", SOURCE_NONE, FIELD_STR, color_atom_parser },
+	[ATOM_WORKTREEPATH] = { "worktreepath", SOURCE_NONE },
+	[ATOM_ALIGN] = { "align", SOURCE_NONE, FIELD_STR, align_atom_parser },
+	[ATOM_END] = { "end", SOURCE_NONE },
+	[ATOM_IF] = { "if", SOURCE_NONE, FIELD_STR, if_atom_parser },
+	[ATOM_THEN] = { "then", SOURCE_NONE },
+	[ATOM_ELSE] = { "else", SOURCE_NONE },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -628,6 +679,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 	at = used_atom_cnt;
 	used_atom_cnt++;
 	REALLOC_ARRAY(used_atom, used_atom_cnt);
+	used_atom[at].atom_type = i;
 	used_atom[at].name = xmemdupz(atom, ep - atom);
 	used_atom[at].type = valid_atom[i].cmp_type;
 	used_atom[at].source = valid_atom[i].source;
@@ -652,7 +704,7 @@ static int parse_ref_filter_atom(const struct ref_format *format,
 		return -1;
 	if (*atom == '*')
 		need_tagged = 1;
-	if (!strcmp(valid_atom[i].name, "symref"))
+	if (i == ATOM_SYMREF)
 		need_symref = 1;
 	return at;
 }
@@ -965,14 +1017,15 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "objecttype"))
+		if (atom_type == ATOM_OBJECTTYPE)
 			v->s = xstrdup(type_name(oi->type));
-		else if (starts_with(name, "objectsize")) {
+		else if (atom_type == ATOM_OBJECTSIZE) {
 			if (used_atom[i].u.objectsize.option == O_SIZE_DISK) {
 				v->value = oi->disk_size;
 				v->s = xstrfmt("%"PRIuMAX, (uintmax_t)oi->disk_size);
@@ -980,9 +1033,9 @@ static void grab_common_values(struct atom_value *val, int deref, struct expand_
 				v->value = oi->size;
 				v->s = xstrfmt("%"PRIuMAX , (uintmax_t)oi->size);
 			}
-		} else if (!strcmp(name, "deltabase"))
+		} else if (atom_type == ATOM_DELTABASE)
 			v->s = xstrdup(oid_to_hex(&oi->delta_base_oid));
-		else if (deref)
+		else if (atom_type == ATOM_OBJECTNAME && deref)
 			grab_oid(name, "objectname", &oi->oid, v, &used_atom[i]);
 	}
 }
@@ -995,16 +1048,17 @@ static void grab_tag_values(struct atom_value *val, int deref, struct object *ob
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (!strcmp(name, "tag"))
+		if (atom_type == ATOM_TAG)
 			v->s = xstrdup(tag->tag);
-		else if (!strcmp(name, "type") && tag->tagged)
+		else if (atom_type == ATOM_TYPE && tag->tagged)
 			v->s = xstrdup(type_name(tag->tagged->type));
-		else if (!strcmp(name, "object") && tag->tagged)
+		else if (atom_type == ATOM_OBJECT && tag->tagged)
 			v->s = xstrdup(oid_to_hex(&tag->tagged->oid));
 	}
 }
@@ -1017,18 +1071,20 @@ static void grab_commit_values(struct atom_value *val, int deref, struct object
 
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
-		if (grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
+		if (atom_type == ATOM_TREE &&
+		    grab_oid(name, "tree", get_commit_tree_oid(commit), v, &used_atom[i]))
 			continue;
-		if (!strcmp(name, "numparent")) {
+		if (atom_type == ATOM_NUMPARENT) {
 			v->value = commit_list_count(commit->parents);
 			v->s = xstrfmt("%lu", (unsigned long)v->value);
 		}
-		else if (starts_with(name, "parent")) {
+		else if (atom_type == ATOM_PARENT) {
 			struct commit_list *parents;
 			struct strbuf s = STRBUF_INIT;
 			for (parents = commit->parents; parents; parents = parents->next) {
@@ -1208,15 +1264,16 @@ static void grab_person(const char *who, struct atom_value *val, int deref, void
 		return;
 	for (i = 0; i < used_atom_cnt; i++) {
 		const char *name = used_atom[i].name;
+		enum atom_type atom_type = used_atom[i].atom_type;
 		struct atom_value *v = &val[i];
 		if (!!deref != (*name == '*'))
 			continue;
 		if (deref)
 			name++;
 
-		if (starts_with(name, "creatordate"))
+		if (atom_type == ATOM_CREATORDATE)
 			grab_date(wholine, v, name);
-		else if (!strcmp(name, "creator"))
+		else if (atom_type == ATOM_CREATOR)
 			v->s = copy_line(wholine);
 	}
 }
@@ -1696,6 +1753,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	/* Fill in specials first */
 	for (i = 0; i < used_atom_cnt; i++) {
 		struct used_atom *atom = &used_atom[i];
+		enum atom_type atom_type = atom->atom_type;
 		const char *name = used_atom[i].name;
 		struct atom_value *v = &ref->value[i];
 		int deref = 0;
@@ -1710,18 +1768,18 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			name++;
 		}
 
-		if (starts_with(name, "refname"))
+		if (atom_type == ATOM_REFNAME)
 			refname = get_refname(atom, ref);
-		else if (!strcmp(name, "worktreepath")) {
+		else if (atom_type == ATOM_WORKTREEPATH) {
 			if (ref->kind == FILTER_REFS_BRANCHES)
 				v->s = get_worktree_path(atom, ref);
 			else
 				v->s = xstrdup("");
 			continue;
 		}
-		else if (starts_with(name, "symref"))
+		else if (atom_type == ATOM_SYMREF)
 			refname = get_symref(atom, ref);
-		else if (starts_with(name, "upstream")) {
+		else if (atom_type == ATOM_UPSTREAM) {
 			const char *branch_name;
 			/* only local branches may have an upstream */
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1737,7 +1795,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			else
 				v->s = xstrdup("");
 			continue;
-		} else if (atom->u.remote_ref.push) {
+		} else if (atom_type == ATOM_PUSH && atom->u.remote_ref.push) {
 			const char *branch_name;
 			v->s = xstrdup("");
 			if (!skip_prefix(ref->refname, "refs/heads/",
@@ -1756,10 +1814,10 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 			free((char *)v->s);
 			fill_remote_ref_details(atom, refname, branch, &v->s);
 			continue;
-		} else if (starts_with(name, "color:")) {
+		} else if (atom_type == ATOM_COLOR) {
 			v->s = xstrdup(atom->u.color);
 			continue;
-		} else if (!strcmp(name, "flag")) {
+		} else if (atom_type == ATOM_FLAG) {
 			char buf[256], *cp = buf;
 			if (ref->flag & REF_ISSYMREF)
 				cp = copy_advance(cp, ",symref");
@@ -1772,23 +1830,24 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup(buf + 1);
 			}
 			continue;
-		} else if (!deref && grab_oid(name, "objectname", &ref->objectname, v, atom)) {
-			continue;
-		} else if (!strcmp(name, "HEAD")) {
+		} else if (!deref && atom_type == ATOM_OBJECTNAME &&
+			   grab_oid(name, "objectname", &ref->objectname, v, atom)) {
+				continue;
+		} else if (atom_type == ATOM_HEAD) {
 			if (atom->u.head && !strcmp(ref->refname, atom->u.head))
 				v->s = xstrdup("*");
 			else
 				v->s = xstrdup(" ");
 			continue;
-		} else if (starts_with(name, "align")) {
+		} else if (atom_type == ATOM_ALIGN) {
 			v->handler = align_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "end")) {
+		} else if (atom_type == ATOM_END) {
 			v->handler = end_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (starts_with(name, "if")) {
+		} else if (atom_type == ATOM_IF) {
 			const char *s;
 			if (skip_prefix(name, "if:", &s))
 				v->s = xstrdup(s);
@@ -1796,11 +1855,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup("");
 			v->handler = if_atom_handler;
 			continue;
-		} else if (!strcmp(name, "then")) {
+		} else if (atom_type == ATOM_THEN) {
 			v->handler = then_atom_handler;
 			v->s = xstrdup("");
 			continue;
-		} else if (!strcmp(name, "else")) {
+		} else if (atom_type == ATOM_ELSE) {
 			v->handler = else_atom_handler;
 			v->s = xstrdup("");
 			continue;
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2021-05-13 15:15 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-08 15:22 [PATCH 0/2] [GSOC][RFC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
2021-05-08 15:22 ` [PATCH 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
2021-05-08 15:22 ` [PATCH 2/2] [GSOC][RFC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
2021-05-09  6:21   ` Christian Couder
2021-05-09  8:26     ` Junio C Hamano
2021-05-09 13:44       ` ZheNing Hu
2021-05-09 13:40     ` ZheNing Hu
2021-05-10 15:03 ` [PATCH v2 0/2] " ZheNing Hu via GitGitGadget
2021-05-10 15:03   ` [PATCH v2 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
2021-05-10 15:03   ` [PATCH v2 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
2021-05-11  2:14     ` Junio C Hamano
2021-05-11  5:51       ` Christian Couder
2021-05-11  6:12         ` Junio C Hamano
2021-05-11 12:53           ` ZheNing Hu
2021-05-11 12:37         ` ZheNing Hu
2021-05-11 13:05         ` Junio C Hamano
2021-05-11 12:18       ` ZheNing Hu
2021-05-12 12:11   ` [PATCH v3 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
2021-05-12 12:11     ` [PATCH v3 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
2021-05-12 23:11       ` Junio C Hamano
2021-05-13  9:04         ` ZheNing Hu
2021-05-12 12:11     ` [PATCH v3 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget
2021-05-12 23:21       ` Junio C Hamano
2021-05-13  9:25         ` ZheNing Hu
2021-05-13 15:15     ` [PATCH v4 0/2] [GSOC][RFC] " ZheNing Hu via GitGitGadget
2021-05-13 15:15       ` [PATCH v4 1/2] [GSOC] ref-filter: add objectsize to used_atom ZheNing Hu via GitGitGadget
2021-05-13 15:15       ` [PATCH v4 2/2] [GSOC] ref-filter: introduce enum atom_type ZheNing Hu via GitGitGadget

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).