From: Eric Dumazet <edumazet@google.com>
To: Samuel Mendoza-Jonas <samjonas@amazon.com>
Cc: netdev@vger.kernel.org, Stewart Smith <trawets@amazon.com>,
"David S . Miller" <davem@davemloft.net>,
David Ahern <dsahern@kernel.org>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
linux-kernel@vger.kernel.org, benh@amazon.com,
stable@vger.kernel.org
Subject: Re: [PATCH net] net/ipv6: Reduce chance of collisions in inet6_hashfn()
Date: Thu, 29 Jun 2023 12:05:29 +0200 [thread overview]
Message-ID: <CANn89i+6d9K1VwNK1Joc-Yb_4jAfV_YFzk=z_K2_Oy+xJHSn_g@mail.gmail.com> (raw)
In-Reply-To: <20230629015844.800280-1-samjonas@amazon.com>
On Thu, Jun 29, 2023 at 3:59 AM Samuel Mendoza-Jonas
<samjonas@amazon.com> wrote:
>
> From: Stewart Smith <trawets@amazon.com>
>
> For both IPv4 and IPv6 incoming TCP connections are tracked in a hash
> table with a hash over the source & destination addresses and ports.
> However, the IPv6 hash is insufficient and can lead to a high rate of
> collisions.
>
> The IPv6 hash used an XOR to fit everything into the 96 bits for the
> fast jenkins hash, meaning it is possible for an external entity to
> ensure the hash collides, thus falling back to a linear search in the
> bucket, which is slow.
>
> We take the approach of hash half the data; hash the other half; and
> then hash them together. We do this with 3x jenkins hashes rather than
> 2x to calculate the hashing value for the connection covering the full
> length of the addresses and ports.
>
...
> While this may look like it adds overhead, the reality of modern CPUs
> means that this is unmeasurable in real world scenarios.
>
> In simulating with llvm-mca, the increase in cycles for the hashing code
> was ~5 cycles on Skylake (from a base of ~50), and an extra ~9 on
> Nehalem (base of ~62).
>
> In commit dd6d2910c5e0 ("netfilter: conntrack: switch to siphash")
> netfilter switched from a jenkins hash to a siphash, but even the faster
> hsiphash is a more significant overhead (~20-30%) in some preliminary
> testing. So, in this patch, we keep to the more conservative approach to
> ensure we don't add much overhead per SYN.
>
> In testing, this results in a consistently even spread across the
> connection buckets. In both testing and real-world scenarios, we have
> not found any measurable performance impact.
>
> Cc: stable@vger.kernel.org
> Fixes: 08dcdbf6a7b9 ("ipv6: use a stronger hash for tcp")
> Fixes: b3da2cf37c5c ("[INET]: Use jhash + random secret for ehash.")
> Signed-off-by: Stewart Smith <trawets@amazon.com>
> Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com>
> ---
> include/net/ipv6.h | 4 +---
> net/ipv6/inet6_hashtables.c | 5 ++++-
> 2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/include/net/ipv6.h b/include/net/ipv6.h
> index 7332296eca44..f9bb54869d82 100644
> --- a/include/net/ipv6.h
> +++ b/include/net/ipv6.h
> @@ -752,9 +752,7 @@ static inline u32 ipv6_addr_hash(const struct in6_addr *a)
> /* more secured version of ipv6_addr_hash() */
> static inline u32 __ipv6_addr_jhash(const struct in6_addr *a, const u32 initval)
> {
> - u32 v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1];
> -
> - return jhash_3words(v,
> + return jhash_3words((__force u32)a->s6_addr32[1],
> (__force u32)a->s6_addr32[2],
> (__force u32)a->s6_addr32[3],
> initval);
Hmmm... see my following comment.
> diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
> index b64b49012655..bb7198081974 100644
> --- a/net/ipv6/inet6_hashtables.c
> +++ b/net/ipv6/inet6_hashtables.c
> @@ -33,7 +33,10 @@ u32 inet6_ehashfn(const struct net *net,
> net_get_random_once(&inet6_ehash_secret, sizeof(inet6_ehash_secret));
> net_get_random_once(&ipv6_hash_secret, sizeof(ipv6_hash_secret));
>
> - lhash = (__force u32)laddr->s6_addr32[3];
> + lhash = jhash_3words((__force u32)laddr->s6_addr32[3],
> + (((u32)lport) << 16) | (__force u32)fport,
> + (__force u32)faddr->s6_addr32[0],
> + ipv6_hash_secret);
This seems wrong to me.
Reusing ipv6_hash_secret and other keys twice is not good, I am sure
some security researchers
would love this...
Please just change __ipv6_addr_jhash(), so that all users can benefit
from a more secure version ?
It also leaves lhash / fhash names relevant here.
We will probably have to switch to sip (or other stronger hash than
jhash) at some point, it is a tradeoff.
We might also add a break in the loop when a bucket exceeds a given
safety length,
because attackers can eventually exploit hashes after some point.
The following patch looks much saner to me.
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7332296eca44b84dca1bbecb545f6824a0e8ed3d..2acc4c808d45d1c1bb1c5076e79842e136203e4c
100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -752,12 +752,8 @@ static inline u32 ipv6_addr_hash(const struct in6_addr *a)
/* more secured version of ipv6_addr_hash() */
static inline u32 __ipv6_addr_jhash(const struct in6_addr *a, const
u32 initval)
{
- u32 v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1];
-
- return jhash_3words(v,
- (__force u32)a->s6_addr32[2],
- (__force u32)a->s6_addr32[3],
- initval);
+ return jhash2((__force const u32 *)a->s6_addr32,
+ ARRAY_SIZE(a->s6_addr32), initval);
}
next prev parent reply other threads:[~2023-06-29 10:05 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-29 1:58 [PATCH net] net/ipv6: Reduce chance of collisions in inet6_hashfn() Samuel Mendoza-Jonas
2023-06-29 10:05 ` Eric Dumazet [this message]
2023-07-01 0:26 ` Smith, Stewart
2023-07-01 0:26 ` Smith, Stewart
2023-07-01 2:00 ` Eric Dumazet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CANn89i+6d9K1VwNK1Joc-Yb_4jAfV_YFzk=z_K2_Oy+xJHSn_g@mail.gmail.com' \
--to=edumazet@google.com \
--cc=benh@amazon.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=samjonas@amazon.com \
--cc=stable@vger.kernel.org \
--cc=trawets@amazon.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).