All of lore.kernel.org
 help / color / mirror / Atom feed
From: Herbert Xu <herbert@gondor.apana.org.au>
To: "David S. Miller" <davem@davemloft.net>,
	Thomas Graf <tgraf@suug.ch>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Patrick McHardy <kaber@trash.net>,
	Josh Triplett <josh@joshtriplett.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	netdev@vger.kernel.org
Subject: [v2 PATCH 0/10] rhashtable: Multiple rehashing
Date: Sun, 22 Mar 2015 19:03:30 +1100	[thread overview]
Message-ID: <20150322080330.GA3416@gondor.apana.org.au> (raw)

Hi:                                           

This series introduces multiple rehashing.

Recall that the original implementation in br_multicast used
two list pointers per hash node and therefore is limited to at
most one rehash at a time since you need one list pointer for
the old table and one for the new table.

Thanks to Josh Triplett's suggestion of using a single list pointer
we're no longer limited by that.  So it is perfectly OK to have
an arbitrary number of tables in existence at any one time.

The reader and removal simply has to walk from the oldest table
to the newest table in order not to miss anything.  Insertion
without lookup are just as easy as we simply go to the last table
that we can find and add the entry there.

However, insertion with uniqueness lookup is more complicated
because we need to ensure that two simultaneous insertions of the
same key do not both succeed.  To achieve this, all insertions
including those without lookups are required to obtain the bucket
lock from the oldest hash table that is still alive.  This is
determined by having the rehasher (there is only one rehashing
thread in the system) keep a pointer of where it is up to.  If
a bucket has already been rehashed then it is dead, i.e., there
cannot be any more insertions to it, otherwise it is considered
alive.  This guarantees that the same key cannot be inserted
in two different tables in parallel.

Patch 1 is actually a bug fix for the walker.

Patch 2-6 eliminates unnecessary out-of-line copies of jhash.

Patch 7 disables automatic shrinking so now shrinking is only
possible if requested by the user.

Patch 8 introduces multiple rehashing.  This means that if we
decide to grow then we will grow regardless of whether the previous
one has finished.  However, this is still asynchronous meaning
that if insertions come fast enough we may still end up with a
table that is overutilised.

Patch 9 adds support for GFP_ATOMIC allocations of struct bucket_table.

Finally patch 10 enables immediate rehashing.  This is done either
when the table reaches 100% utilisation, or when the chain length
exceeds 16 (the latter can be disabled on request, e.g., for
nft_hash.

With these patches the system should no longer have any trouble
dealing with fast insertions on a small table.  In the worst
case you end up with a list of tables that's log N in length
while the rehasher catches up.

v2 fixes the blank subject of patch 5 and the prevents vzalloc
for GFP_ATOMIC callers in patch 9.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

             reply	other threads:[~2015-03-22  8:03 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-22  8:03 Herbert Xu [this message]
2015-03-22  8:03 ` [v2 PATCH 1/10] rhashtable: Add barrier to ensure we see new tables in walker Herbert Xu
2015-03-22 10:47   ` Thomas Graf
2015-03-22  8:04 ` [v2 PATCH 2/10] rhashtable: Eliminate unnecessary branch in rht_key_hashfn Herbert Xu
2015-03-22 11:07   ` Thomas Graf
2015-03-22  8:04 ` [v2 PATCH 3/10] rhashtable: Allow hashfn to be unset Herbert Xu
2015-03-22 11:55   ` Thomas Graf
2015-03-22 12:04     ` Herbert Xu
2015-03-22 12:32       ` Thomas Graf
2015-03-22 21:12         ` Herbert Xu
2015-03-23  9:58           ` Thomas Graf
2015-03-23 10:18             ` Herbert Xu
2015-03-23 14:29   ` David Laight
2015-03-22  8:04 ` [v2 PATCH 4/10] netlink: Use default rhashtable hashfn Herbert Xu
2015-03-22 11:55   ` Thomas Graf
2015-03-23  1:18   ` Simon Horman
2015-03-23 12:56     ` Herbert Xu
2015-03-22  8:04 ` [v2 PATCH 5/10] tipc: " Herbert Xu
2015-03-22 11:56   ` Thomas Graf
2015-03-22  8:04 ` [v2 PATCH 6/10] netfilter: " Herbert Xu
2015-03-22 11:56   ` Thomas Graf
2015-03-23 11:32     ` Herbert Xu
2015-03-22  8:04 ` [v2 PATCH 7/10] rhashtable: Disable automatic shrinking Herbert Xu
2015-03-22 12:17   ` Thomas Graf
2015-03-22 13:06     ` Thomas Graf
2015-03-23  0:07       ` Herbert Xu
2015-03-23  8:37         ` Thomas Graf
2015-03-23  9:29           ` Herbert Xu
2015-03-23  9:43             ` Thomas Graf
2015-03-23  0:09     ` Herbert Xu
2015-03-23  8:33       ` Thomas Graf
2015-03-23  9:28         ` Herbert Xu
2015-03-23  9:36           ` Thomas Graf
2015-03-23  9:39             ` Herbert Xu
2015-03-23  9:44               ` Herbert Xu
2015-03-23 10:08                 ` Thomas Graf
2015-03-23 10:19                   ` Herbert Xu
2015-03-23 16:45           ` David Miller
2015-03-23 16:44         ` David Miller
2015-03-23 21:48           ` Herbert Xu
2015-03-23 22:13           ` Thomas Graf
2015-03-22  8:04 ` [v2 PATCH 8/10] rhashtable: Add multiple rehash support Herbert Xu
2015-03-22  8:04 ` [v2 PATCH 9/10] rhashtable: Allow GFP_ATOMIC bucket table allocation Herbert Xu
2015-03-22  8:04 ` [v2 PATCH 10/10] rhashtable: Add immediate rehash during insertion Herbert Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150322080330.GA3416@gondor.apana.org.au \
    --to=herbert@gondor.apana.org.au \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=kaber@trash.net \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=tgraf@suug.ch \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.