All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bert Wesarg <bert.wesarg@googlemail.com>
To: Jeff King <peff@peff.net>
Cc: "Jonathan Nieder" <jrnieder@gmail.com>,
	git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Jakub Narebski" <jnareb@gmail.com>, "Ted Ts'o" <tytso@mit.edu>,
	"Ævar Arnfjörð" <avarab@gmail.com>,
	"Clemens Buchacher" <drizzd@aon.at>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	"David Barr" <davidbarr@google.com>
Subject: Re: [PATCH 1/3] implement generic key/value map
Date: Thu, 14 Jul 2011 20:52:04 +0200	[thread overview]
Message-ID: <CAKPyHN0-VbzjMaMJFZeGGrGX6HuGNEBHNVNf0cexB2vu21_13g@mail.gmail.com> (raw)
In-Reply-To: <20110714175105.GA21771@sigill.intra.peff.net>

On Thu, Jul 14, 2011 at 19:51, Jeff King <peff@peff.net> wrote:
> It is frequently useful to have a fast, generic data
> structure mapping keys to values. We already have something
> like this in the "decorate" API, but it has two downsides:
>
>  1. The key type must always be a "struct object *".
>
>  2. The value type is a void pointer, which means it is
>     inefficient and cumbersome for storing small values.
>     One must either encode their value inside the void
>     pointer, or allocate additional storage for the pointer
>     to point to.
>
> This patch introduces a generic map data structure, mapping
> keys of arbitrary type to values of arbitrary type.
>
> One possible strategy for implementation is to have a struct
> that points to a sequence of bytes for each of the key and
> the value, and to try to treat them as opaque in the code.
> However, this code gets complex, has a lot of casts, and
> runs afoul of violating alignment and strict aliasing rules.
>
> This patch takes a different approach. We parameterize the
> types in each map by putting the declarations and
> implementations inside macros, and expand the macros with
> the correct types. This lets the compiler see the actual
> code, with its real types, and figure out things like struct
> packing and alignment itself.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> In addition to switching from using void pointers to macro expansion,
> this has one other difference from my previous patch: it handles
> arbitrary types for keys, not just object pointers. This was mentioned
> by Jakub, and would allow things like a fast bi-directional map for SVN
> revision numbers and commits.
>
> I tried to keep the implementation simple. Two things that could be changed:
>
>  1. We can't assume that the map key is a pointer. So the sentinel
>     "NULL" value isn't necessarily available to use, and we have to
>     keep a separate bit in each hash entry to say "is this valid".
>     This means when we _do_ store a pointer, we end up with an extra
>     32 bits or so in each hash entry for the "used" flag.
>
>     We could add a macro parameter for sentinel values, so that types
>     which _can_ handle this efficiently don't have to pay the price.
>     Or we could decide that mapping arbitrary keys isn't worth the
>     hassle. I wrote this way to be flexible for future use; I don't
>     personally have plans to add svn revision number mappings.
>
>  2. It assumes values are assignable. That means storing something like
>     "unsigned char sha1[20]" doesn't work. You can wrap it in a struct,
>     but do we assume that struct assignment works everywhere? I didn't
>     check, but I think it is in C89 but some antique compilers didn't
>     allow it. Switching it to use memcpy() would be easy enough (or
>     again, parameterizing so that assignable things don't have to pay
>     the price).
>
>  Makefile |    2 +
>  map.c    |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  map.h    |   24 +++++++++++++++++
>  3 files changed, 112 insertions(+), 0 deletions(-)
>  create mode 100644 map.c
>  create mode 100644 map.h
>
> diff --git a/Makefile b/Makefile
> index 46793d1..6242321 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -530,6 +530,7 @@ LIB_H += list-objects.h
>  LIB_H += ll-merge.h
>  LIB_H += log-tree.h
>  LIB_H += mailmap.h
> +LIB_H += map.h
>  LIB_H += merge-file.h
>  LIB_H += merge-recursive.h
>  LIB_H += notes.h
> @@ -621,6 +622,7 @@ LIB_OBJS += ll-merge.o
>  LIB_OBJS += lockfile.o
>  LIB_OBJS += log-tree.o
>  LIB_OBJS += mailmap.o
> +LIB_OBJS += map.o
>  LIB_OBJS += match-trees.o
>  LIB_OBJS += merge-file.o
>  LIB_OBJS += merge-recursive.o
> diff --git a/map.c b/map.c
> new file mode 100644
> index 0000000..378cecb
> --- /dev/null
> +++ b/map.c
> @@ -0,0 +1,86 @@
> +#include "cache.h"
> +#include "map.h"
> +#include "object.h"
> +
> +static unsigned int hash_obj(const struct object *obj, unsigned int n)
> +{
> +       unsigned int hash;
> +
> +       memcpy(&hash, obj->sha1, sizeof(unsigned int));
> +       return hash % n;
> +}
> +
> +static unsigned int cmp_obj(const struct object *a, const struct object *b)
> +{
> +       return b == a;
> +}
> +
> +#define MAP_IMPLEMENT(name, ktype, vtype, cmp_fun, hash_fun) \

This define should probably in the header too. Else this is completely useless.

Bert

  reply	other threads:[~2011-07-14 18:52 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-13  6:47 [RFC/PATCHv2 0/6] generation numbers for faster traversals Jeff King
2011-07-13  6:57 ` [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers Jeff King
2011-07-13 17:52   ` Jonathan Nieder
2011-07-13 20:08     ` Jeff King
2011-07-14 17:34       ` Jeff King
2011-07-14 17:51         ` [PATCH 1/3] implement generic key/value map Jeff King
2011-07-14 18:52           ` Bert Wesarg [this message]
2011-07-14 18:54             ` Bert Wesarg
2011-07-14 18:55               ` Jeff King
2011-07-14 19:07                 ` Bert Wesarg
2011-07-14 19:14                   ` Jeff King
2011-07-14 19:18                     ` Bert Wesarg
2011-07-14 17:52         ` [PATCH 2/3] fast-export: use object to uint32 map instead of "decorate" Jeff King
2011-07-15  9:40           ` Sverre Rabbelier
2011-07-15 20:00             ` Jeff King
2011-07-14 17:53         ` [PATCH 3/3] decorate: use "map" for the underlying implementation Jeff King
2011-07-14 21:06         ` [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers Junio C Hamano
2011-08-04 22:43           ` [RFC/PATCH 0/5] macro-based key/value maps Jeff King
2011-08-04 22:45             ` [PATCH 1/5] implement generic key/value map Jeff King
2011-08-04 22:46             ` [PATCH 2/5] fast-export: use object to uint32 map instead of "decorate" Jeff King
2011-08-04 22:46             ` [PATCH 3/5] decorate: use "map" for the underlying implementation Jeff King
2011-08-04 22:46             ` [PATCH 4/5] map: implement persistent maps Jeff King
2011-08-04 22:46             ` [PATCH 5/5] implement metadata cache subsystem Jeff King
2011-08-04 22:48             ` [RFC/PATCH 0/2] patch-id caching Jeff King
2011-08-04 22:49               ` [PATCH 1/2] cherry: read default config Jeff King
2011-08-04 22:49               ` [PATCH 2/2] cache patch ids on disk Jeff King
2011-08-04 22:52                 ` Jeff King
2011-08-05 11:03             ` [RFC/PATCH 0/5] macro-based key/value maps Jeff King
2011-08-05 15:31               ` René Scharfe
2011-08-06  6:30                 ` Jeff King
2011-07-13  7:04 ` [RFC/PATCHv2 2/6] add metadata-cache infrastructure Jeff King
2011-07-13  8:18   ` Bert Wesarg
2011-07-13  8:31     ` Jeff King
2011-07-13  8:45       ` Bert Wesarg
2011-07-13 19:18         ` Jeff King
2011-07-13 19:40       ` Junio C Hamano
2011-07-13 19:33   ` Junio C Hamano
2011-07-13 20:25     ` Jeff King
2011-07-13  7:05 ` [RFC/PATCHv2 3/6] commit: add commit_generation function Jeff King
2011-07-13 14:26   ` Eric Sunshine
2011-07-13  7:05 ` [RFC/PATCHv2 4/6] pretty: support %G to show the generation number of a commit Jeff King
2011-07-13  7:06 ` [RFC/PATCHv2 5/6] check commit generation cache validity against grafts Jeff King
2011-07-13 14:26   ` Eric Sunshine
2011-07-13 19:35     ` Jeff King
2011-07-13  7:06 ` [RFC/PATCHv2 6/6] limit "contains" traversals based on commit generation Jeff King
2011-07-13  7:23   ` Jeff King
2011-07-13 20:33     ` Junio C Hamano
2011-07-13 20:58       ` Jeff King
2011-07-13 21:12         ` Junio C Hamano
2011-07-13 21:18           ` Jeff King
2011-07-15 18:22   ` Junio C Hamano
2011-07-15 20:40     ` Jeff King
2011-07-15 21:04       ` Junio C Hamano
2011-07-15 21:14         ` Jeff King
2011-07-15 21:01 ` Generation numbers and replacement objects Jakub Narebski
2011-07-15 21:10   ` Jeff King
2011-07-16 21:10     ` Jakub Narebski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKPyHN0-VbzjMaMJFZeGGrGX6HuGNEBHNVNf0cexB2vu21_13g@mail.gmail.com \
    --to=bert.wesarg@googlemail.com \
    --cc=avarab@gmail.com \
    --cc=davidbarr@google.com \
    --cc=drizzd@aon.at \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    --cc=spearce@spearce.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.