All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: "Martin Ågren" <martin.agren@gmail.com>, git@vger.kernel.org
Cc: Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: tsan: t3008: hashmap_add touches size from multiple threads
Date: Tue, 15 Aug 2017 13:59:22 -0400	[thread overview]
Message-ID: <cff383c2-ca57-caba-5a46-7dec4abc25a4@jeffhostetler.com> (raw)
In-Reply-To: <adb37b70139fd1e2bac18bfd22c8b96683ae18eb.1502780344.git.martin.agren@gmail.com>



On 8/15/2017 8:53 AM, Martin Ågren wrote:
> Using SANITIZE=thread made t3008-ls-files-lazy-init-name-hash.sh hit
> the potential race below.
> 
> What seems to happen is, threaded_lazy_init_name_hash ends up using
> hashmap_add on the_index.dir_hash from two threads in a way that tsan
> considers racy. While the buckets each have their own mutex, the "size"
> does not. So it might end up with the wrong (lower) value. The size is
> used to determine when to resize, but that should be fine, since
> resizing is turned off due to threading anyway.
 >
> If the size is ever used for something else, the consequences might
> range from cosmetic error to devastating. I have a "feeling" the size is
> not used at the time, although maybe it is, in some roundabout way which
> I'm not seeing.

Good catch!  Yes, the size field is unguarded.  The only references to
it that I found were used in _add() and _remove() in the need-to-rehash
check.

Since auto-rehash is turned off, this shouldn't be a problem, but it
does feel odd to have a possibly incorrect size due to races.

Rather than adding something like (a cross-platform version of)
InterlockedIncrement(), I'm wondering if it would be better to
disable/invalidate the size field when auto-rehash is turned off
so we don't have to worry about it.

Something like this:


$ git diff
diff --git a/hashmap.c b/hashmap.c
index 9b6a12859b..be97642daa 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -205,6 +205,9 @@ void hashmap_add(struct hashmap *map, void *entry)
         ((struct hashmap_entry *) entry)->next = map->table[b];
         map->table[b] = entry;

+       if (map->disallow_rehash)
+               return;
+
         /* fix size and rehash if appropriate */
         map->size++;
         if (map->size > map->grow_at)
@@ -223,6 +226,9 @@ void *hashmap_remove(struct hashmap *map, const void 
*key, const void *keydata)
         *e = old->next;
         old->next = NULL;

+       if (map->disallow_rehash)
+               return;
+
         /* fix size and rehash if appropriate */
         map->size--;
         if (map->size < map->shrink_at)
diff --git a/hashmap.h b/hashmap.h
index 7a8fa7fa3d..ec9541b981 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -183,7 +183,8 @@ struct hashmap {
         const void *cmpfn_data;

         /* total number of entries (0 means the hashmap is empty) */
-       unsigned int size;
+       /* -1 means size is unknown for threading reasons */
+       int size;

         /*
          * tablesize is the allocated size of the hash table. A non-0 value
@@ -360,6 +361,15 @@ int hashmap_bucket(const struct hashmap *map, 
unsigned int hash);
  static inline void hashmap_disallow_rehash(struct hashmap *map, 
unsigned value)
  {
         map->disallow_rehash = value;
+       if (value) {
+               /*
+                * Assume threaded operations starting, so don't
+                * try to keep size current.
+                */
+               size = -1;
+       } else {
+               /* TODO count items in all buckets and set size. */
+       }
  }

  /*


Jeff

  reply	other threads:[~2017-08-15 17:59 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-15 12:53 [PATCH/RFC 0/5] Some ThreadSanitizer-results Martin Ågren
2017-08-15 12:53 ` [PATCH 1/5] convert: initialize attr_action in convert_attrs Martin Ågren
2017-08-15 14:17   ` Torsten Bögershausen
2017-08-15 14:29     ` Torsten Bögershausen
2017-08-15 14:40     ` Martin Ågren
2017-08-15 12:53 ` [PATCH 2/5] pack-objects: take lock before accessing `remaining` Martin Ågren
2017-08-15 19:50   ` Johannes Sixt
2017-08-15 12:53 ` [PATCH 3/5] Makefile: define GIT_THREAD_SANITIZER Martin Ågren
2017-08-15 12:53 ` [PATCH 4/5] strbuf_reset: don't write to slopbuf with ThreadSanitizer Martin Ågren
2017-08-15 18:43   ` Junio C Hamano
2017-08-15 19:06     ` Martin Ågren
2017-08-15 19:19       ` Junio C Hamano
2017-08-15 12:53 ` [PATCH 5/5] ThreadSanitizer: add suppressions Martin Ågren
2017-08-15 12:53 ` tsan: t3008: hashmap_add touches size from multiple threads Martin Ågren
2017-08-15 17:59   ` Jeff Hostetler [this message]
2017-08-15 18:17     ` Stefan Beller
2017-08-15 18:40       ` Martin Ågren
2017-08-15 18:48         ` Stefan Beller
2017-08-15 19:21           ` Martin Ågren
2017-08-15 20:46             ` Jeff Hostetler
2017-08-30 18:59   ` [PATCH] hashmap: address ThreadSanitizer concerns Jeff Hostetler
2017-08-30 18:59     ` [PATCH] hashmap: add API to disable item counting when threaded Jeff Hostetler
2017-09-01 23:31       ` Johannes Schindelin
2017-09-01 23:50         ` Jonathan Nieder
2017-09-05 16:39           ` Jeff Hostetler
2017-09-05 17:13             ` Martin Ågren
2017-09-02  8:17         ` Jeff King
2017-09-04 15:59           ` Johannes Schindelin
2017-09-05 16:54           ` Jeff Hostetler
2017-09-06  3:43           ` Junio C Hamano
2017-09-05 16:33         ` Jeff Hostetler
2017-09-02  8:05       ` Jeff King
2017-09-05 17:07         ` Jeff Hostetler
2017-09-02  8:39       ` Simon Ruderich
2017-09-06  1:24       ` Junio C Hamano
2017-09-06 15:33         ` Jeff Hostetler
2017-09-06 15:43     ` [PATCH v2] hashmap: address ThreadSanitizer concerns Jeff Hostetler
2017-09-06 15:43       ` [PATCH v2] hashmap: add API to disable item counting when threaded Jeff Hostetler
2017-08-15 12:53 ` tsan: t5400: set_try_to_free_routine Martin Ågren
2017-08-15 17:35   ` Stefan Beller
2017-08-15 18:44     ` Martin Ågren
2017-08-17 10:57   ` Jeff King
2017-08-20 10:06 ` [PATCH/RFC 0/5] Some ThreadSanitizer-results Jeff King
2017-08-20 10:45   ` Martin Ågren
2017-08-21 17:43 ` [PATCH v2 0/4] " Martin Ågren
2017-08-21 17:43   ` [PATCH v2 1/4] convert: always initialize attr_action in convert_attrs Martin Ågren
2017-08-21 17:43   ` [PATCH v2 2/4] pack-objects: take lock before accessing `remaining` Martin Ågren
2017-08-21 17:43   ` [PATCH v2 3/4] strbuf_setlen: don't write to strbuf_slopbuf Martin Ågren
2017-08-23 17:24     ` Junio C Hamano
2017-08-23 17:43       ` Martin Ågren
2017-08-23 18:30         ` Junio C Hamano
2017-08-23 20:37     ` Brandon Casey
2017-08-23 21:04       ` Junio C Hamano
2017-08-23 21:20         ` Brandon Casey
2017-08-23 21:54           ` Brandon Casey
2017-08-23 22:11             ` Brandon Casey
2017-08-24 16:52             ` Junio C Hamano
2017-08-24 18:29               ` Brandon Casey
2017-08-24 19:16                 ` Martin Ågren
2017-08-23 22:24           ` Junio C Hamano
2017-08-23 22:39             ` Brandon Casey
2017-08-21 17:43   ` [PATCH v2 4/4] ThreadSanitizer: add suppressions Martin Ågren
2017-08-25 17:04     ` Jeff King
2017-08-28 20:56   ` [PATCH v2 0/4] Some ThreadSanitizer-results Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cff383c2-ca57-caba-5a46-7dec4abc25a4@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=jeffhost@microsoft.com \
    --cc=martin.agren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.