From: "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
Christian Couder <christian.couder@gmail.com>,
Hariom Verma <hariom18599@gmail.com>,
Karthik Nayak <karthik.188@gmail.com>,
Felipe Contreras <felipe.contreras@gmail.com>,
Bagas Sanjaya <bagasdotme@gmail.com>, Jeff King <peff@peff.net>,
Phillip Wood <phillip.wood123@gmail.com>,
ZheNing Hu <adlternative@gmail.com>
Subject: [PATCH v2 0/2] [GSOC] ref-filter: add %(raw) atom
Date: Sun, 30 May 2021 13:01:56 +0000
Message-ID: <pull.963.v2.git.1622379718.gitgitgadget@gmail.com>
In-Reply-To: <pull.963.git.1622126603.gitgitgadget@gmail.com>
To let git cat-file --batch use the ref-filter logic, this series adds a
%(raw) atom to ref-filter.
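As the commit message of patch 2/2 notes, the raw data of blob and tree
objects may contain '\0', so the value has to carry its own length. The
following is only an illustrative sketch of that problem, not code from the
series; bytes_hidden_after_nul() is a hypothetical helper name. It reports how
many bytes a NUL-terminated view of a buffer would silently drop.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of ref-filter.c): return how many bytes
 * of a `size`-byte buffer become invisible to NUL-terminated string
 * functions, i.e. everything from the first embedded NUL onwards.
 * Returns 0 when the buffer contains no NUL.
 */
static size_t bytes_hidden_after_nul(const void *buf, size_t size)
{
	const char *start = buf;
	const char *nul = memchr(start, '\0', size);

	return nul ? size - (size_t)(nul - start) : 0;
}
```

For a buffer such as "abc\0def" with an explicit size of 7, a strlen()-based
path would stop after "abc", which is why the atom value records a size next
to the data.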
Changes since v1:
1. Use a more elegant memcasecmp().
2. Allow %(raw:size) to be used with --<lang>.
3. Remove the redundant BUG() in then_atom_handler().
4. Roll back to the original function name grab_sub_body_contents().
5. Move the object type check in grab_sub_body_contents() into the
preceding patch.
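For readers who do not want to dig through the range-diff, here is a rough
standalone sketch of item 1 and of the length-aware comparison it feeds into.
It is an approximation under stated assumptions, not the patch itself: the
names memcasecmp_sketch() and sized_cmp_sketch() are made up for this
illustration, and the sketch uses unsigned char with the standard <ctype.h>
tolower(), whereas the patch uses const char * together with Git's own
tolower().

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive memcmp(): compares exactly n bytes, NULs included. */
static int memcasecmp_sketch(const void *vs1, const void *vs2, size_t n)
{
	const unsigned char *s1 = vs1;
	const unsigned char *s2 = vs2;
	const unsigned char *end = s1 + n;

	for (; s1 < end; s1++, s2++) {
		int diff = tolower(*s1) - tolower(*s2);
		if (diff)
			return diff;
	}
	return 0;
}

/*
 * Compare two sized buffers: the common prefix decides first; if it is
 * identical, the shorter buffer sorts before the longer one.
 */
static int sized_cmp_sketch(const void *a, size_t a_size,
			    const void *b, size_t b_size, int icase)
{
	size_t n = a_size < b_size ? a_size : b_size;
	int cmp = icase ? memcasecmp_sketch(a, b, n) : memcmp(a, b, n);

	if (cmp)
		return cmp;
	if (a_size > b_size)
		return 1;
	if (a_size < b_size)
		return -1;
	return 0;
}
```

Returning 1 or -1 for the size tie-break, rather than subtracting two size_t
values, avoids relying on unsigned wrap-around when the result is narrowed to
int.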
ZheNing Hu (2):
  [GSOC] ref-filter: add obj-type check in grab contents
  [GSOC] ref-filter: add %(raw) atom

 Documentation/git-for-each-ref.txt |  14 ++
 ref-filter.c                       | 158 ++++++++++++++++++-----
 t/t6300-for-each-ref.sh            | 200 +++++++++++++++++++++++++++++
 3 files changed, 338 insertions(+), 34 deletions(-)
base-commit: 5d5b1473453400224ebb126bf3947e0a3276bdf5
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-963%2Fadlternative%2Fref-filter-raw-atom-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-963/adlternative/ref-filter-raw-atom-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/963
Range-diff vs v1:
-: ------------ > 1: e6c26d19a3f3 [GSOC] ref-filter: add obj-type check in grab contents
1: b3848f24f2d3 ! 2: e44a2ed0db59 [GSOC] ref-filter: add %(raw) atom
@@ Commit message
The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depands on the output of the atom being
- a structured string (end with '\0').
+ text (specifically, no embedded NULs in it).
E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
@@ Commit message
can record raw object size, it can help us add raw object data to
the buffer or compare two buffers which contain raw object data.
- Beyond, `--format=%(raw)` should not combine with `--python`, `--shell`,
+ Beyond, `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, `--perl` because if our binary raw data is passed to a variable
in the host language, the host languages may cause escape errors.
+ Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
+ Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
+ Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
@@ Documentation/git-for-each-ref.txt: and `date` to extract the named component.
+raw:size::
+ The raw data size of the object.
+
-+Note that `--format=%(raw)` should not combine with `--python`, `--shell`, `--tcl`,
++Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
+`--perl` because if our binary raw data is passed to a variable in the host language,
+the host languages may cause escape errors.
+
@@ ref-filter.c: static int contents_atom_parser(const struct ref_format *format, s
+static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+ const char *arg, struct strbuf *err)
+{
-+ if (!arg) {
++ if (!arg)
+ atom->u.raw_data.option = RAW_BARE;
-+ } else if (!strcmp(arg, "size"))
++ else if (!strcmp(arg, "size"))
+ atom->u.raw_data.option = RAW_LENGTH;
+ else
+ return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
@@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
return strbuf_addf_ret(err, -1, _("malformed field name: %.*s"),
(int)(ep-atom), atom);
-+ if (format->quote_style && starts_with(sp, "raw"))
-+ return strbuf_addf_ret(err, -1, _("--format=%.*s should not combine with"
+- /* Do we have the atom already used elsewhere? */
+- for (i = 0; i < used_atom_cnt; i++) {
+- int len = strlen(used_atom[i].name);
+- if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
+- return i;
+- }
+-
+ /*
+ * If the atom name has a colon, strip it and everything after
+ * it off - it specifies the format for this entry, and
+@@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
+ arg = memchr(sp, ':', ep - sp);
+ atom_len = (arg ? arg : ep) - sp;
+
++ if (format->quote_style && !strncmp(sp, "raw", 3) && !arg)
++ return strbuf_addf_ret(err, -1, _("--format=%.*s cannot be used with"
+ "--python, --shell, --tcl, --perl"), (int)(ep-atom), atom);
+
- /* Do we have the atom already used elsewhere? */
- for (i = 0; i < used_atom_cnt; i++) {
- int len = strlen(used_atom[i].name);
++ /* Do we have the atom already used elsewhere? */
++ for (i = 0; i < used_atom_cnt; i++) {
++ int len = strlen(used_atom[i].name);
++ if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
++ return i;
++ }
++
+ /* Is the atom a valid one? */
+ for (i = 0; i < ARRAY_SIZE(valid_atom); i++) {
+ int len = strlen(valid_atom[i].name);
@@ ref-filter.c: static int parse_ref_filter_atom(const struct ref_format *format,
return at;
}
@@ ref-filter.c: static int then_atom_handler(struct atom_value *atomv, struct ref_
*/
if (if_then_else->cmp_status == COMPARE_EQUAL) {
- if (!strcmp(if_then_else->str, cur->output.buf))
-+ if (!if_then_else->str)
-+ BUG("when if_then_else->cmp_status == COMPARE_EQUAL,"
-+ "if_then_else->str must not be null");
+ if (str_len == cur->output.len &&
+ !memcmp(if_then_else->str, cur->output.buf, cur->output.len))
if_then_else->condition_satisfied = 1;
} else if (if_then_else->cmp_status == COMPARE_UNEQUAL) {
- if (strcmp(if_then_else->str, cur->output.buf))
-+ if (!if_then_else->str)
-+ BUG("when if_then_else->cmp_status == COMPARE_UNEQUAL,"
-+ "if_then_else->str must not be null");
+ if (str_len != cur->output.len ||
+ memcmp(if_then_else->str, cur->output.buf, cur->output.len))
if_then_else->condition_satisfied = 1;
@@ ref-filter.c: static int end_atom_handler(struct atom_value *atomv, struct ref_f
}
strbuf_release(&s);
@@ ref-filter.c: static void append_lines(struct strbuf *out, const char *buf, unsigned long size
- }
/* See grab_values */
--static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
-+static void grab_raw_data(struct atom_value *val, int deref, void *buf, unsigned long buf_size, struct object *obj)
+ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf,
+- struct object *obj)
++ unsigned long buf_size, struct object *obj)
{
int i;
const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
-@@ ref-filter.c: static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
- continue;
+@@ ref-filter.c: static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf,
if (deref)
name++;
-- if (strcmp(name, "body") &&
-- !starts_with(name, "subject") &&
-- !starts_with(name, "trailers") &&
-- !starts_with(name, "contents"))
-+
+
+ if (starts_with(name, "raw")) {
+ if (atom->u.raw_data.option == RAW_BARE) {
+ v->s = xmemdupz(buf, buf_size);
+ v->s_size = buf_size;
-+ } else if (atom->u.raw_data.option == RAW_LENGTH)
++ } else if (atom->u.raw_data.option == RAW_LENGTH) {
+ v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size);
++ }
+ continue;
+ }
+
-+ if ((obj->type != OBJ_TAG &&
-+ obj->type != OBJ_COMMIT) ||
-+ (strcmp(name, "body") &&
-+ !starts_with(name, "subject") &&
-+ !starts_with(name, "trailers") &&
-+ !starts_with(name, "contents")))
- continue;
- if (!subpos)
- find_subpos(buf,
+ if ((obj->type != OBJ_TAG &&
+ obj->type != OBJ_COMMIT) ||
+ (strcmp(name, "body") &&
@@ ref-filter.c: static void fill_missing_values(struct atom_value *val)
* pointed at by the ref itself; otherwise it is the object the
* ref (which is a tag) refers to.
@@ ref-filter.c: static void fill_missing_values(struct atom_value *val)
switch (obj->type) {
case OBJ_TAG:
grab_tag_values(val, deref, obj);
-- grab_sub_body_contents(val, deref, buf);
-+ grab_raw_data(val, deref, buf, buf_size, obj);
+- grab_sub_body_contents(val, deref, buf, obj);
++ grab_sub_body_contents(val, deref, buf, buf_size, obj);
grab_person("tagger", val, deref, buf);
break;
case OBJ_COMMIT:
grab_commit_values(val, deref, obj);
-- grab_sub_body_contents(val, deref, buf);
-+ grab_raw_data(val, deref, buf, buf_size, obj);
+- grab_sub_body_contents(val, deref, buf, obj);
++ grab_sub_body_contents(val, deref, buf, buf_size, obj);
grab_person("author", val, deref, buf);
grab_person("committer", val, deref, buf);
break;
case OBJ_TREE:
/* grab_tree_values(val, deref, obj, buf, sz); */
-+ grab_raw_data(val, deref, buf, buf_size, obj);
++ grab_sub_body_contents(val, deref, buf, buf_size, obj);
break;
case OBJ_BLOB:
/* grab_blob_values(val, deref, obj, buf, sz); */
-+ grab_raw_data(val, deref, buf, buf_size, obj);
++ grab_sub_body_contents(val, deref, buf, buf_size, obj);
break;
default:
die("Eh? Object of type %d?", obj->type);
@@ ref-filter.c: static int compare_detached_head(struct ref_array_item *a, struct
+static int memcasecmp(const void *vs1, const void *vs2, size_t n)
+{
-+ size_t i;
-+ const char *s1 = (const char *)vs1;
-+ const char *s2 = (const char *)vs2;
++ const char *s1 = (const void *)vs1;
++ const char *s2 = (const void *)vs2;
++ const char *end = s1 + n;
+
-+ for (i = 0; i < n; i++) {
-+ unsigned char u1 = s1[i];
-+ unsigned char u2 = s2[i];
-+ int U1 = toupper (u1);
-+ int U2 = toupper (u2);
-+ int diff = (UCHAR_MAX <= INT_MAX ? U1 - U2
-+ : U1 < U2 ? -1 : U2 < U1);
++ for (; s1 < end; s1++, s2++) {
++ int diff = tolower(*s1) - tolower(*s2);
+ if (diff)
+ return diff;
+ }
@@ ref-filter.c: static int compare_detached_head(struct ref_array_item *a, struct
static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, struct ref_array_item *b)
{
struct atom_value *va, *vb;
-@@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
- int cmp_detached_head = 0;
- cmp_type cmp_type = used_atom[s->atom].type;
- struct strbuf err = STRBUF_INIT;
-+ size_t slen = 0;
-
- if (get_ref_atom_value(a, s->atom, &va, &err))
- die("%s", err.buf);
@@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array_item *a, stru
} else if (s->sort_flags & REF_SORTING_VERSION) {
cmp = versioncmp(va->s, vb->s);
@@ ref-filter.c: static int cmp_ref_sorting(struct ref_sorting *s, struct ref_array
+ int (*cmp_fn)(const void *, const void *, size_t);
+ cmp_fn = s->sort_flags & REF_SORTING_ICASE
+ ? memcasecmp : memcmp;
++ size_t a_size = va->s_size == ATOM_VALUE_S_SIZE_INIT ?
++ strlen(va->s) : va->s_size;
++ size_t b_size = vb->s_size == ATOM_VALUE_S_SIZE_INIT ?
++ strlen(vb->s) : vb->s_size;
+
-+ if (va->s_size != ATOM_VALUE_S_SIZE_INIT &&
-+ vb->s_size != ATOM_VALUE_S_SIZE_INIT) {
-+ cmp = cmp_fn(va->s, vb->s, va->s_size > vb->s_size ?
-+ vb->s_size : va->s_size);
-+ } else if (va->s_size == ATOM_VALUE_S_SIZE_INIT) {
-+ slen = strlen(va->s);
-+ cmp = cmp_fn(va->s, vb->s, slen > vb->s_size ?
-+ vb->s_size : slen);
-+ } else {
-+ slen = strlen(vb->s);
-+ cmp = cmp_fn(va->s, vb->s, slen > va->s_size ?
-+ slen : va->s_size);
++ cmp = cmp_fn(va->s, vb->s, b_size > a_size ?
++ a_size : b_size);
++ if (!cmp) {
++ if (a_size > b_size)
++ cmp = 1;
++ else if (a_size < b_size)
++ cmp = -1;
+ }
-+ cmp = cmp ? cmp : va->s_size - vb->s_size;
+ }
} else {
if (va->value < vb->value)
@@ t/t6300-for-each-ref.sh: test_atom refs/myblobs/first contents:body ""
+ refs/myblobs/first not empty
+ EOF
+ git for-each-ref --format="%(refname) %(if)%(raw)%(then)not empty%(else)empty%(end)" \
-+ refs/myblobs/ >actual &&
++ refs/myblobs/ >actual &&
+ test_cmp expected actual
+'
+
@@ t/t6300-for-each-ref.sh: test_atom refs/myblobs/first contents:body ""
+ test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
+'
+
++test_expect_success '%(raw:size) with --shell' '
++ git for-each-ref --format="%(raw:size)" | while read line
++ do
++ echo "'\''$line'\''" >>expect
++ done &&
++ git for-each-ref --format="%(raw:size)" --shell >actual &&
++ test_cmp expect actual
++'
++
+test_expect_success 'for-each-ref --format compare with cat-file --batch' '
+ git rev-parse refs/mytrees/first | git cat-file --batch >expected &&
+ git for-each-ref --format="%(objectname) %(objecttype) %(objectsize)
2: aa6d73f3e526 < -: ------------ [GSOC] ref-filter: add %(header) atom
--
gitgitgadget