Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
From: Bert Wesarg <bert.wesarg@googlemail.com>
To: Jeff King <peff@peff.net>
Cc: "Jonathan Nieder" <jrnieder@gmail.com>,
	git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Jakub Narebski" <jnareb@gmail.com>, "Ted Ts\'o" <tytso@mit.edu>,
	"Ævar Arnfjörð" <avarab@gmail.com>,
	"Clemens Buchacher" <drizzd@aon.at>,
	"Shawn O. Pearce" <spearce@spearce.org>,
	"David Barr" <davidbarr@google.com>
Subject: Re: [PATCH 1/3] implement generic key/value map
Date: Thu, 14 Jul 2011 20:52:04 +0200
Message-ID: <CAKPyHN0-VbzjMaMJFZeGGrGX6HuGNEBHNVNf0cexB2vu21_13g@mail.gmail.com> (raw)
In-Reply-To: <20110714175105.GA21771@sigill.intra.peff.net>

On Thu, Jul 14, 2011 at 19:51, Jeff King <peff@peff.net> wrote:
> It is frequently useful to have a fast, generic data
> structure mapping keys to values. We already have something
> like this in the "decorate" API, but it has two downsides:
>
>  1. The key type must always be a "struct object *".
>
>  2. The value type is a void pointer, which means it is
>     inefficient and cumbersome for storing small values.
>     One must either encode their value inside the void
>     pointer, or allocate additional storage for the pointer
>     to point to.
>
> This patch introduces a generic map data structure, mapping
> keys of arbitrary type to values of arbitrary type.
>
> One possible strategy for implementation is to have a struct
> that points to a sequence of bytes for each of the key and
> the value, and to try to treat them as opaque in the code.
> However, this code gets complex, has a lot of casts, and
> runs afoul of violating alignment and strict aliasing rules.
>
> This patch takes a different approach. We parameterize the
> types in each map by putting the declarations and
> implementations inside macros, and expand the macros with
> the correct types. This lets the compiler see the actual
> code, with its real types, and figure out things like struct
> packing and alignment itself.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> In addition to switching from using void pointers to macro expansion,
> this has one other difference from my previous patch: it handles
> arbitrary types for keys, not just object pointers. This was mentioned
> by Jakub, and would allow things like a fast bi-directional map for SVN
> revision numbers and commits.
>
> I tried to keep the implementation simple. Two things that could be changed:
>
>  1. We can't assume that the map key is a pointer. So the sentinel
>     "NULL" value isn't necessarily available to use, and we have to
>     keep a separate bit in each hash entry to say "is this valid".
>     This means when we _do_ store a pointer, we end up with an extra
>     32 bits or so in each hash entry for the "used" flag.
>
>     We could add a macro parameter for sentinel values, so that types
>     which _can_ handle this efficiently don't have to pay the price.
>     Or we could decide that mapping arbitrary keys isn't worth the
>     hassle. I wrote this way to be flexible for future use; I don't
>     personally have plans to add svn revision number mappings.
>
>  2. It assumes values are assignable. That means storing something like
>     "unsigned char sha1[20]" doesn't work. You can wrap it in a struct,
>     but do we assume that struct assignment works everywhere? I didn't
>     check, but I think it is in C89 but some antique compilers didn't
>     allow it. Switching it to use memcpy() would be easy enough (or
>     again, parameterizing so that assignable things don't have to pay
>     the price).
>
>  Makefile |    2 +
>  map.c    |   86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  map.h    |   24 +++++++++++++++++
>  3 files changed, 112 insertions(+), 0 deletions(-)
>  create mode 100644 map.c
>  create mode 100644 map.h
>
> diff --git a/Makefile b/Makefile
> index 46793d1..6242321 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -530,6 +530,7 @@ LIB_H += list-objects.h
>  LIB_H += ll-merge.h
>  LIB_H += log-tree.h
>  LIB_H += mailmap.h
> +LIB_H += map.h
>  LIB_H += merge-file.h
>  LIB_H += merge-recursive.h
>  LIB_H += notes.h
> @@ -621,6 +622,7 @@ LIB_OBJS += ll-merge.o
>  LIB_OBJS += lockfile.o
>  LIB_OBJS += log-tree.o
>  LIB_OBJS += mailmap.o
> +LIB_OBJS += map.o
>  LIB_OBJS += match-trees.o
>  LIB_OBJS += merge-file.o
>  LIB_OBJS += merge-recursive.o
> diff --git a/map.c b/map.c
> new file mode 100644
> index 0000000..378cecb
> --- /dev/null
> +++ b/map.c
> @@ -0,0 +1,86 @@
> +#include "cache.h"
> +#include "map.h"
> +#include "object.h"
> +
> +static unsigned int hash_obj(const struct object *obj, unsigned int n)
> +{
> +       unsigned int hash;
> +
> +       memcpy(&hash, obj->sha1, sizeof(unsigned int));
> +       return hash % n;
> +}
> +
> +static unsigned int cmp_obj(const struct object *a, const struct object *b)
> +{
> +       return b == a;
> +}
> +
> +#define MAP_IMPLEMENT(name, ktype, vtype, cmp_fun, hash_fun) \

This define should probably in the header too. Else this is completely useless.

Bert

  reply index

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-13  6:47 [RFC/PATCHv2 0/6] generation numbers for faster traversals Jeff King
2011-07-13  6:57 ` [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers Jeff King
2011-07-13 17:52   ` Jonathan Nieder
2011-07-13 20:08     ` Jeff King
2011-07-14 17:34       ` Jeff King
2011-07-14 17:51         ` [PATCH 1/3] implement generic key/value map Jeff King
2011-07-14 18:52           ` Bert Wesarg [this message]
2011-07-14 18:54             ` Bert Wesarg
2011-07-14 18:55               ` Jeff King
2011-07-14 19:07                 ` Bert Wesarg
2011-07-14 19:14                   ` Jeff King
2011-07-14 19:18                     ` Bert Wesarg
2011-07-14 17:52         ` [PATCH 2/3] fast-export: use object to uint32 map instead of "decorate" Jeff King
2011-07-15  9:40           ` Sverre Rabbelier
2011-07-15 20:00             ` Jeff King
2011-07-14 17:53         ` [PATCH 3/3] decorate: use "map" for the underlying implementation Jeff King
2011-07-14 21:06         ` [RFC/PATCHv2 1/6] decorate: allow storing values instead of pointers Junio C Hamano
2011-08-04 22:43           ` [RFC/PATCH 0/5] macro-based key/value maps Jeff King
2011-08-04 22:45             ` [PATCH 1/5] implement generic key/value map Jeff King
2011-08-04 22:46             ` [PATCH 2/5] fast-export: use object to uint32 map instead of "decorate" Jeff King
2011-08-04 22:46             ` [PATCH 3/5] decorate: use "map" for the underlying implementation Jeff King
2011-08-04 22:46             ` [PATCH 4/5] map: implement persistent maps Jeff King
2011-08-04 22:46             ` [PATCH 5/5] implement metadata cache subsystem Jeff King
2011-08-05 11:03             ` [RFC/PATCH 0/5] macro-based key/value maps Jeff King
2011-08-05 15:31               ` René Scharfe
2011-08-06  6:30                 ` Jeff King
2011-07-13  7:04 ` [RFC/PATCHv2 2/6] add metadata-cache infrastructure Jeff King
2011-07-13  8:18   ` Bert Wesarg
2011-07-13  8:31     ` Jeff King
2011-07-13  8:45       ` Bert Wesarg
2011-07-13 19:18         ` Jeff King
2011-07-13 19:40       ` Junio C Hamano
2011-07-13 19:33   ` Junio C Hamano
2011-07-13 20:25     ` Jeff King
2011-07-13  7:05 ` [RFC/PATCHv2 3/6] commit: add commit_generation function Jeff King
2011-07-13 14:26   ` Eric Sunshine
2011-07-13  7:05 ` [RFC/PATCHv2 4/6] pretty: support %G to show the generation number of a commit Jeff King
2011-07-13  7:06 ` [RFC/PATCHv2 5/6] check commit generation cache validity against grafts Jeff King
2011-07-13 14:26   ` Eric Sunshine
2011-07-13 19:35     ` Jeff King
2011-07-13  7:06 ` [RFC/PATCHv2 6/6] limit "contains" traversals based on commit generation Jeff King
2011-07-13  7:23   ` Jeff King
2011-07-13 20:33     ` Junio C Hamano
2011-07-13 20:58       ` Jeff King
2011-07-13 21:12         ` Junio C Hamano
2011-07-13 21:18           ` Jeff King
2011-07-15 18:22   ` Junio C Hamano
2011-07-15 20:40     ` Jeff King
2011-07-15 21:04       ` Junio C Hamano
2011-07-15 21:14         ` Jeff King
2011-07-15 21:01 ` Generation numbers and replacement objects Jakub Narebski
2011-07-15 21:10   ` Jeff King
2011-07-16 21:10     ` Jakub Narebski
2011-08-04 22:48 [RFC/PATCH 0/2] patch-id caching Jeff King
2011-08-04 22:49 ` [PATCH 1/2] cherry: read default config Jeff King
2011-08-04 22:49 ` [PATCH 2/2] cache patch ids on disk Jeff King
2011-08-04 22:52   ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKPyHN0-VbzjMaMJFZeGGrGX6HuGNEBHNVNf0cexB2vu21_13g@mail.gmail.com \
    --to=bert.wesarg@googlemail.com \
    --cc=avarab@gmail.com \
    --cc=davidbarr@google.com \
    --cc=drizzd@aon.at \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jnareb@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    --cc=spearce@spearce.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git