From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 900C4C433DB for ; Tue, 9 Mar 2021 16:57:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4D07E6522A for ; Tue, 9 Mar 2021 16:57:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230391AbhCIQ5O (ORCPT ); Tue, 9 Mar 2021 11:57:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229775AbhCIQ5L (ORCPT ); Tue, 9 Mar 2021 11:57:11 -0500 Received: from mail-pg1-x530.google.com (mail-pg1-x530.google.com [IPv6:2607:f8b0:4864:20::530]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3CDAC06174A for ; Tue, 9 Mar 2021 08:57:10 -0800 (PST) Received: by mail-pg1-x530.google.com with SMTP id g4so9174724pgj.0 for ; Tue, 09 Mar 2021 08:57:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=BkrL4L5G/8bnKy5DMlvTjJcaRoTEwQssaMMsYOi8CHI=; b=DJsFibwoxcWrdBNLKquqcbTo/H0Xc6drT2Mx0uuHDqGqAgdG/MYvhDppwBYHUamFk6 bugfbg6nOmyAn6/sEvmKuXGidFKjO6fUPv9ZVBxCcFT7t01ANjzMgvOeaIP/W1A6QWqH h4QhiKlSHzm45drzbeBzqqbAY2SuijdwN9jgcHb7KpxgeK0Qh3LrKdy6gw0OsvcgkCfs EBH4cKd8DlcKlOQmCZOvqX3ouW0Biy8mKjBbnWj944vKN9pYzgVhaU4uM4ZSwRciDv9r CVgeBqeusJoQ0/aNDDZW93OO8J/4tJhbL8PkwwFRa5fO5Um1j3tWJ7tt8z5ErsHXFYsc 4wzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=BkrL4L5G/8bnKy5DMlvTjJcaRoTEwQssaMMsYOi8CHI=; b=Z+qJ1Tg7b91+6MiXWCw3EuTPc33zb1Wf9L3jXn2c393O8A1rGDV9oMe+zRJeSn8oLq rKhHmlefU/+MR+PgdFRBYsUATXxxY09LiLJRvlJeQLahXyn6GqV9rJob/mDp8t/ZaAYu RRK9X8BtcUM5ZrV++mfceWKCiTh6a8ZxrrsL91UuvZK3dC0fMeLG/3Q7a5tM3Ct5Ozvv e6FMrD6Z/4fB7Us7J5WTcCRNbSSP22cYRefg4xhS+qTX7SrofXtzU9arPJDr/hhbndgs hrYgTazyYfLqECdiG1CIE2vQpywHAPAr1oF6NAyIaZO879oT/HiWETAxdfkn2RaKy3d6 wjxA== X-Gm-Message-State: AOAM530tnfDzAD+SdVCkdu1yOroASwurrBMC/54opFp9KTP/XdVUiR83 VEc/xDC+qyucTPCh5p2Xg/1DYZKqZfVXlyhMkVtlWFeII6A= X-Google-Smtp-Source: ABdhPJzlyp3Qt+Nhh9SI1Zp7uffZWmLeJU3YTzDgZSUvI3KwB12XXu/qJfNAtTc0WneyHH79YNc4pNIvRAi1DxSv7rg= X-Received: by 2002:a63:1350:: with SMTP id 16mr25645682pgt.85.1615309030142; Tue, 09 Mar 2021 08:57:10 -0800 (PST) MIME-Version: 1.0 References: <20210308022138.28166-1-avarab@gmail.com> <20210308150650.18626-7-avarab@gmail.com> In-Reply-To: <20210308150650.18626-7-avarab@gmail.com> From: Elijah Newren Date: Tue, 9 Mar 2021 08:56:59 -0800 Message-ID: Subject: Re: [PATCH 06/30] cache.h: have base_name_compare() take "is tree?", not "mode" To: =?UTF-8?B?w4Z2YXIgQXJuZmrDtnLDsCBCamFybWFzb24=?= Cc: Git Mailing List , Junio C Hamano , Kirill Smelkov , =?UTF-8?B?Tmd1eeG7hW4gVGjDoWkgTmfhu41jIER1eQ==?= Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Mon, Mar 8, 2021 at 7:07 AM =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote: > > Change the base_name_compare() API and the related df_name_compare() > function to take a boolean argument indicating whether the entry is a > tree or not, instead of having them call S_ISDIR(mode) on their own. > > This makes use of the new "object_type" field in the "name_entry". > > The API being modified here was originally added way back in > 958ba6c96eb (Introduce "base_name_compare()" helper function, > 2005-05-20). > > Signed-off-by: =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason > --- > builtin/fast-import.c | 8 ++++---- > builtin/mktree.c | 4 ++-- > cache.h | 4 ++-- > combine-diff.c | 8 +++++--- > match-trees.c | 6 ++++-- > merge-ort.c | 4 ++-- > merge-recursive.c | 6 +++--- > read-cache.c | 16 ++++++++-------- > tree-diff.c | 7 +++++-- > unpack-trees.c | 15 ++++++++------- > 10 files changed, 43 insertions(+), 35 deletions(-) > > diff --git a/builtin/fast-import.c b/builtin/fast-import.c > index dd4d09ceceb..ce4613c1595 100644 > --- a/builtin/fast-import.c > +++ b/builtin/fast-import.c > @@ -1288,8 +1288,8 @@ static int tecmp0 (const void *_a, const void *_b) > struct tree_entry *a =3D *((struct tree_entry**)_a); > struct tree_entry *b =3D *((struct tree_entry**)_b); > return base_name_compare( > - a->name->str_dat, a->name->str_len, a->versions[0].mode, > - b->name->str_dat, b->name->str_len, b->versions[0].mode); > + a->name->str_dat, a->name->str_len, 1, > + b->name->str_dat, b->name->str_len, 1); > } > > static int tecmp1 (const void *_a, const void *_b) > @@ -1297,8 +1297,8 @@ static int tecmp1 (const void *_a, const void *_b) > struct tree_entry *a =3D *((struct tree_entry**)_a); > struct tree_entry *b =3D *((struct tree_entry**)_b); > return base_name_compare( > - a->name->str_dat, a->name->str_len, a->versions[1].mode, > - b->name->str_dat, b->name->str_len, b->versions[1].mode); > + a->name->str_dat, a->name->str_len, 1, > + b->name->str_dat, b->name->str_len, 1); > } You've treated all files as directories, and in doing so broken the sorting order of paths like the following foo foo.txt foo needs to sort after foo.txt if it's a directory, and before foo.txt if it's a file; that's the whole reason base_name_compare() wants to know is_tree-ness. It looks like we have a whole in our regression test coverage, since they didn't catch this breakage. (git will accept and work with trees in the wrong sort-order, but it can create confusing diffs and whatnot.) > static void mktree(struct tree_content *t, int v, struct strbuf *b) > diff --git a/builtin/mktree.c b/builtin/mktree.c > index 891991b00d6..2c1973229ac 100644 > --- a/builtin/mktree.c > +++ b/builtin/mktree.c > @@ -37,8 +37,8 @@ static int ent_compare(const void *a_, const void *b_) > { > struct treeent *a =3D *(struct treeent **)a_; > struct treeent *b =3D *(struct treeent **)b_; > - return base_name_compare(a->name, a->len, a->mode, > - b->name, b->len, b->mode); > + return base_name_compare(a->name, a->len, S_ISDIR(a->mode), > + b->name, b->len, S_ISDIR(b->mode)); > } > > static void write_tree(struct object_id *oid) > diff --git a/cache.h b/cache.h > index e513f0ee5b4..49994dae916 100644 > --- a/cache.h > +++ b/cache.h > @@ -1506,8 +1506,8 @@ int repo_interpret_branch_name(struct repository *r= , > > int validate_headref(const char *ref); > > -int base_name_compare(const char *name1, int len1, int mode1, const char= *name2, int len2, int mode2); > -int df_name_compare(const char *name1, int len1, int mode1, const char *= name2, int len2, int mode2); > +int base_name_compare(const char *name1, int len1, int isdir1, const cha= r *name2, int len2, int isdir2); > +int df_name_compare(const char *name1, int len1, int isdir1, const char = *name2, int len2, int isdir2); A very minor nit/question: Do we want to call this isdir or istree? I know we use S_ISDIR() but that's because it's a pre-existing macro pre-dating git. As you mentioned in your commit message, we want to know whether the object in question is a git "tree", so I'm inclined to go with istree. But, I'm probably bikeshedding here. > int name_compare(const char *name1, size_t len1, const char *name2, size= _t len2); > int cache_name_stage_compare(const char *name1, int len1, int stage1, co= nst char *name2, int len2, int stage2); > > diff --git a/combine-diff.c b/combine-diff.c > index 9228aebc16b..64d7aaf1710 100644 > --- a/combine-diff.c > +++ b/combine-diff.c > @@ -16,11 +16,13 @@ > static int compare_paths(const struct combine_diff_path *one, > const struct diff_filespec *two) > { > - if (!S_ISDIR(one->mode) && !S_ISDIR(two->mode)) > + int isdir_one =3D S_ISDIR(one->mode); > + int isdir_two =3D S_ISDIR(two->mode); > + if (!isdir_one && !isdir_two) > return strcmp(one->path, two->path); > > - return base_name_compare(one->path, strlen(one->path), one->mode, > - two->path, strlen(two->path), two->mode)= ; > + return base_name_compare(one->path, strlen(one->path), isdir_one, > + two->path, strlen(two->path), isdir_two)= ; > } > > static int filename_changed(char status) > diff --git a/match-trees.c b/match-trees.c > index 1011357ad0c..4d124a0fd1b 100644 > --- a/match-trees.c > +++ b/match-trees.c > @@ -67,8 +67,10 @@ static void *fill_tree_desc_strict(struct tree_desc *d= esc, > static int base_name_entries_compare(const struct name_entry *a, > const struct name_entry *b) > { > - return base_name_compare(a->path, tree_entry_len(a), a->mode, > - b->path, tree_entry_len(b), b->mode); > + int isdira =3D a->object_type =3D=3D OBJ_TREE; > + int isdirb =3D b->object_type =3D=3D OBJ_TREE; > + return base_name_compare(a->path, tree_entry_len(a), isdira, > + b->path, tree_entry_len(b), isdirb); > } > > /* > diff --git a/merge-ort.c b/merge-ort.c > index 603d30c5217..4075d13aaab 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -2390,8 +2390,8 @@ static int tree_entry_order(const void *a_, const v= oid *b_) > > const struct merged_info *ami =3D a->util; > const struct merged_info *bmi =3D b->util; > - return base_name_compare(a->string, strlen(a->string), ami->resul= t.mode, > - b->string, strlen(b->string), bmi->resul= t.mode); > + return base_name_compare(a->string, strlen(a->string), S_ISDIR(am= i->result.mode), > + b->string, strlen(b->string), S_ISDIR(bm= i->result.mode)); > } > > static void write_tree(struct object_id *result_oid, > diff --git a/merge-recursive.c b/merge-recursive.c > index 1593f374495..0baa6b5ca56 100644 > --- a/merge-recursive.c > +++ b/merge-recursive.c > @@ -554,12 +554,12 @@ static int string_list_df_name_compare(const char *= one, const char *two) > * > * To achieve this, we sort with df_name_compare and provide > * the mode S_IFDIR so that D/F conflicts will sort correctly. > - * We use the mode S_IFDIR for everything else for simplicity, > + * We say we have a directory for everything else for simplicity, > * since in other cases any changes in their order due to > * sorting cause no problems for us. > */ > - int cmp =3D df_name_compare(one, onelen, S_IFDIR, > - two, twolen, S_IFDIR); > + int cmp =3D df_name_compare(one, onelen, 1, two, twolen, 1); > + > /* > * Now that 'foo' and 'foo/bar' compare equal, we have to make su= re > * that 'foo' comes before 'foo/bar'. > diff --git a/read-cache.c b/read-cache.c > index 1e9a50c6c73..edfce1f0cb8 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -462,8 +462,8 @@ int ie_modified(struct index_state *istate, > return 0; > } > > -int base_name_compare(const char *name1, int len1, int mode1, > - const char *name2, int len2, int mode2) > +int base_name_compare(const char *name1, int len1, int isdir1, > + const char *name2, int len2, int isdir2) > { > unsigned char c1, c2; > int len =3D len1 < len2 ? len1 : len2; > @@ -474,9 +474,9 @@ int base_name_compare(const char *name1, int len1, in= t mode1, > return cmp; > c1 =3D name1[len]; > c2 =3D name2[len]; > - if (!c1 && S_ISDIR(mode1)) > + if (!c1 && isdir1) > c1 =3D '/'; > - if (!c2 && S_ISDIR(mode2)) > + if (!c2 && isdir2) > c2 =3D '/'; > return (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0; > } > @@ -491,8 +491,8 @@ int base_name_compare(const char *name1, int len1, in= t mode1, > * This is used by routines that want to traverse the git namespace > * but then handle conflicting entries together when possible. > */ > -int df_name_compare(const char *name1, int len1, int mode1, > - const char *name2, int len2, int mode2) > +int df_name_compare(const char *name1, int len1, int isdir1, > + const char *name2, int len2, int isdir2) > { > int len =3D len1 < len2 ? len1 : len2, cmp; > unsigned char c1, c2; > @@ -504,10 +504,10 @@ int df_name_compare(const char *name1, int len1, in= t mode1, > if (len1 =3D=3D len2) > return 0; > c1 =3D name1[len]; > - if (!c1 && S_ISDIR(mode1)) > + if (!c1 && isdir1) > c1 =3D '/'; > c2 =3D name2[len]; > - if (!c2 && S_ISDIR(mode2)) > + if (!c2 && isdir2) > c2 =3D '/'; > if (c1 =3D=3D '/' && !c2) > return 0; > diff --git a/tree-diff.c b/tree-diff.c > index 7cebbb327e2..f133834597c 100644 > --- a/tree-diff.c > +++ b/tree-diff.c > @@ -50,6 +50,7 @@ static int tree_entry_pathcmp(struct tree_desc *t1, str= uct tree_desc *t2) > { > struct name_entry *e1, *e2; > int cmp; > + int e1_is_tree, e2_is_tree; > > /* empty descriptors sort after valid tree entries */ > if (!t1->size) > @@ -58,9 +59,11 @@ static int tree_entry_pathcmp(struct tree_desc *t1, st= ruct tree_desc *t2) > return -1; > > e1 =3D &t1->entry; > + e1_is_tree =3D e1->object_type =3D=3D OBJ_TREE; This would read slightly clearer with parenthesis around the right hand sid= e. > e2 =3D &t2->entry; > - cmp =3D base_name_compare(e1->path, tree_entry_len(e1), e1->mode, > - e2->path, tree_entry_len(e2), e2->mode); > + e2_is_tree =3D e2->object_type =3D=3D OBJ_TREE; Same. > + cmp =3D base_name_compare(e1->path, tree_entry_len(e1), e1_is_tre= e, > + e2->path, tree_entry_len(e2), e2_is_tree)= ; > return cmp; > } > > diff --git a/unpack-trees.c b/unpack-trees.c > index f5f668f532d..802f7771d75 100644 > --- a/unpack-trees.c > +++ b/unpack-trees.c > @@ -922,7 +922,7 @@ static int traverse_trees_recursive(int n, unsigned l= ong dirmask, > static int do_compare_entry_piecewise(const struct cache_entry *ce, > const struct traverse_info *info, > const char *name, size_t namelen, > - unsigned mode) > + unsigned is_tree) isdir elsewhere and is_tree here? ;-) > { > int pathlen, ce_len; > const char *ce_name; > @@ -930,7 +930,7 @@ static int do_compare_entry_piecewise(const struct ca= che_entry *ce, > if (info->prev) { > int cmp =3D do_compare_entry_piecewise(ce, info->prev, > info->name, info->na= melen, > - info->mode); > + S_ISDIR(info->mode))= ; > if (cmp) > return cmp; > } > @@ -944,13 +944,13 @@ static int do_compare_entry_piecewise(const struct = cache_entry *ce, > ce_len -=3D pathlen; > ce_name =3D ce->name + pathlen; > > - return df_name_compare(ce_name, ce_len, S_IFREG, name, namelen, m= ode); > + return df_name_compare(ce_name, ce_len, 0, name, namelen, is_tree= ); > } > > static int do_compare_entry(const struct cache_entry *ce, > const struct traverse_info *info, > const char *name, size_t namelen, > - unsigned mode) > + unsigned is_tree) > { > int pathlen, ce_len; > const char *ce_name; > @@ -962,7 +962,7 @@ static int do_compare_entry(const struct cache_entry = *ce, > * it is quicker to use the precomputed version. > */ > if (!info->traverse_path) > - return do_compare_entry_piecewise(ce, info, name, namelen= , mode); > + return do_compare_entry_piecewise(ce, info, name, namelen= , is_tree); > > cmp =3D strncmp(ce->name, info->traverse_path, info->pathlen); > if (cmp) > @@ -977,12 +977,13 @@ static int do_compare_entry(const struct cache_entr= y *ce, > ce_len -=3D pathlen; > ce_name =3D ce->name + pathlen; > > - return df_name_compare(ce_name, ce_len, S_IFREG, name, namelen, m= ode); > + return df_name_compare(ce_name, ce_len, 0, name, namelen, is_tree= ); > } > > static int compare_entry(const struct cache_entry *ce, const struct trav= erse_info *info, const struct name_entry *n) > { > - int cmp =3D do_compare_entry(ce, info, n->path, n->pathlen, n->mo= de); > + int is_tree =3D n->object_type =3D=3D OBJ_TREE; Another place where parentheses would speed my reading comprehension just slightly. > + int cmp =3D do_compare_entry(ce, info, n->path, n->pathlen, is_tr= ee); > if (cmp) > return cmp; > > -- > 2.31.0.rc0.126.g04f22c5b82