All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Stefan Beller <sbeller@google.com>
Cc: Junio C Hamano <gitster@pobox.com>, git <git@vger.kernel.org>
Subject: Re: [PATCH] fetch: replace string-list used as a look-up table with a hashmap
Date: Thu, 27 Sep 2018 01:54:15 -0400	[thread overview]
Message-ID: <20180927055415.GC14178@sigill.intra.peff.net> (raw)
In-Reply-To: <CAGZ79kamkLV7yHtCu_qSqNzj7bJm0=Uw1CKxss+zTKJWsYz3qw@mail.gmail.com>

On Wed, Sep 26, 2018 at 03:59:08PM -0700, Stefan Beller wrote:

> > +struct refname_hash_entry {
> > +       struct hashmap_entry ent; /* must be the first member */
> 
> $ git grep "struct hashmap_entry" reveals that we have another
> convention that we follow but not document, which is to stress
> the importance of putting the hashmap_entry first. ;-)

One thing I've liked about the list.h implementation is that you can
store the list pointer anywhere in the struct, or even have the same
struct in multiple lists.

The only funny thing is that you have to "dereference" the iterator like
this:

  struct list_head *pos;
  struct actual_thing *item;
  ...
  item = list_entry(pos, struct actual_thing, list_member);

which is a minor pain, but it's reasonably hard to get it wrong.

I wonder if we could do the same here. The hashmap would only ever see
the "struct hashmap_entry", and then the caller would convert that back
to the actual type. I think we could even get away with not converting
existing callers; if the hashmap _is_ at the front, then that
list_entry() really just devolves to a cast. So as long as the struct
definition and the users of the struct agree, it would just work.

> > +static int refname_hash_entry_cmp(const void *hashmap_cmp_fn_data,
> > +                                 const struct refname_hash_entry *e1,
> > +                                 const struct refname_hash_entry *e2,
> > +                                 const void *keydata)
> > +{
> > +       return strcmp(e1->refname, keydata ? keydata : e2->refname);
> > +}
> 
> and later
> 
>     hashmap_init(...  (hashmap_cmp_fn)refname_hash_entry_cmp, ...);
> 
> I wonder if we'd want to stick to this style, as that seems to be the easiest
> to do, and drop the efforts started in 55c965f3a2 (Merge branch
> 'sb/hashmap-cleanup', 2017-08-11), that avoids the cast in the init,
> but puts the (implicit) casts in the _cmp function as we'd take
> void pointers as the function signature and recast to a local variable.

I think that casting the function pointer technically breaks the C
standard, though probably not in a way that is a problem on any real
systems.

The other thing about the "cast inside the cmp function" approach from
sb/hashmap-cleanup is that it is less likely to go stale. If we change
the definition of hashmap_cmp_fn, then any functions which are manually
cast will suppress the compiler errors. When the function takes void
pointers, the same can happen if "struct refname_hash_entry" is swapped
out for another struct, but IMHO that's a less likely error to make.

I admit that's just a gut feeling, though. It would be nice if we could
avoid casting altogether, but I think that puts us into macro territory
(which I'm not altogether opposed to, but it has its own drawbacks).

-Peff

  reply	other threads:[~2018-09-27  5:54 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-26 21:28 [PATCH] fetch: replace string-list used as a look-up table with a hashmap Junio C Hamano
2018-09-26 22:59 ` Stefan Beller
2018-09-27  5:54   ` Jeff King [this message]
2018-09-29 18:31     ` Junio C Hamano
2018-09-30  4:57       ` Jeff King
2018-09-27  5:34 ` Jeff King
2018-09-30  2:11   ` Junio C Hamano
2018-10-19  3:48     ` [PATCH v3] " Junio C Hamano
2018-10-22  9:57       ` Johannes Schindelin
2018-10-27  6:47         ` Re*: " Junio C Hamano
2018-10-31 14:50           ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180927055415.GC14178@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.