From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Eric Sunshine <sunshine@sunshineco.com>, Git List <git@vger.kernel.org>
Subject: Re: [PATCH 4/8] add functions for memory-efficient bitmaps
Date: Tue, 1 Jul 2014 13:18:00 -0400
Message-ID: <20140701171759.GB7282@sigill.intra.peff.net>
In-Reply-To: <xmqqegy5xbiu.fsf@gitster.dls.corp.google.com>

On Tue, Jul 01, 2014 at 09:57:13AM -0700, Junio C Hamano wrote:

> Another thing I noticed was that the definition of and the
> commentary on bitset_equal() and bitset_empty() sounded somewhat
> "undecided".  These functions take "max" that is deliberately named
> differently from "num_bits" (the width of the bitsets involved),
> inviting to use them for testing only earlier bits in the bitset as
> long as the caller understands the caveat, but the caveat requires
> that the partial bitset to test must be byte-aligned, which makes it
> not very useful in practice, which means we probably do not want
> them to be used for any "max" other than "num_bits".

Yeah, I added that comment because I found "max" to be confusing, but
couldn't think of a better name. I'm not sure why "num_bits" did not
occur to me, as that makes it completely obvious.

>  * take "num_bits", not "max", to clarify that callers must use them
>    only on the full bitset.

This seems like the right solution to me. Handling partially aligned
bytes adds to the complexity and may hurt performance (in fact, I think
bitset_equal could actually just call memcmp, which I should fix).
That's fine if callers care about that feature, but I actually don't
anticipate any that do.
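
For illustration, here is a minimal sketch of what a memcmp-based bitset_equal() taking "num_bits" could look like. It is not the code from the patch: the helper bitset_sizeof() and both signatures are assumptions made for this example, and the comparison is only correct if the padding bits in the last byte are zero in both bitsets.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: number of bytes needed to hold num_bits bits,
 * rounded up to a whole byte.
 */
static size_t bitset_sizeof(int num_bits)
{
	assert(num_bits >= 0);
	return ((size_t)num_bits + CHAR_BIT - 1) / CHAR_BIT;
}

/*
 * Compare two full bitsets of num_bits bits each. Because whole bytes
 * are compared, any padding bits in the final byte must be zero in
 * both bitsets, or equal sets could compare as different.
 */
static int bitset_equal(const unsigned char *a, const unsigned char *b,
			int num_bits)
{
	return !memcmp(a, b, bitset_sizeof(num_bits));
}
```

With this shape there is no way to ask for a comparison of only the first few bits, so the byte-alignment caveat goes away.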

By the way, I chose "unsigned char" as the storage format somewhat
arbitrarily. Performance might be better with "unsigned int" or even
"unsigned long". It means potentially wasting more space, but not more
than one word (minus a byte) per commit (so about 3MB on linux.git).
I'll try to do some timings to see if it's worth doing.
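
The space estimate above can be checked with a small sketch. The function names are hypothetical, and the code assumes 8-bit bytes. It computes how many bytes a bitset occupies when storage is rounded up to whole words of a given size, and how much that adds over byte-granular storage.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes used to store num_bits when storage is allocated in whole
 * words of word_size bytes each (8-bit bytes assumed).
 */
static size_t bitset_bytes(size_t num_bits, size_t word_size)
{
	size_t bits_per_word = word_size * 8;

	assert(word_size > 0);
	return (num_bits + bits_per_word - 1) / bits_per_word * word_size;
}

/* Extra bytes used compared to storing the same bitset in single bytes. */
static size_t bitset_overhead(size_t num_bits, size_t word_size)
{
	return bitset_bytes(num_bits, word_size) - bitset_bytes(num_bits, 1);
}
```

The worst case is a bitset that needs just one bit past a word boundary. With an 8-byte word, that costs 7 extra bytes, which is the "one word (minus a byte)" bound. At 7 bytes per commit, 3MB corresponds to roughly 430,000 commits. That count is inferred from the figures in this message and assumes an 8-byte "unsigned long"; it is not a measurement.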

> In either case, there needs another item in the "caller's responsibility"
> list at the beginning of bitset.h:
> 
>     4. Ensure that padding bits at the end of the bitset array are
>        initialized to 0.

Agreed. That is definitely a requirement I had in mind, but I didn't
think to write it down.
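
As a sketch of what meeting that requirement might look like on the caller's side (all names here are hypothetical and not taken from the patch): allocating with calloc() zeroes the padding bits, setting bits only below num_bits keeps them zero, and a small check can confirm the invariant.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Allocate a bitset for num_bits bits with every bit, padding included, zeroed. */
static unsigned char *bitset_new(int num_bits)
{
	size_t len;

	assert(num_bits >= 0);
	len = ((size_t)num_bits + CHAR_BIT - 1) / CHAR_BIT;
	return calloc(len ? len : 1, 1);
}

/* Set bit n; the caller must make sure n is below num_bits. */
static void bitset_set(unsigned char *bits, int n)
{
	bits[n / CHAR_BIT] |= (unsigned char)(1u << (n % CHAR_BIT));
}

/* Return 1 if the unused high bits of the final byte are all zero. */
static int bitset_padding_is_zero(const unsigned char *bits, int num_bits)
{
	int used = num_bits % CHAR_BIT;
	unsigned char mask;

	if (!used)
		return 1; /* the last byte is fully used, so there is no padding */
	mask = (unsigned char)(UCHAR_MAX << used);
	return !(bits[num_bits / CHAR_BIT] & mask);
}
```

A bitset obtained from malloc() and filled bit by bit would not meet the requirement, since its padding bits start out indeterminate.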

I'll fix both points in the re-roll.

-Peff

