From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
To: David Lamparter <equinox@diac24.net>
Cc: netdev@vger.kernel.org, roopa@cumulusnetworks.com,
"bridge@lists.linux-foundation.org"
<bridge@lists.linux-foundation.org>,
amine.kherbouche@6wind.com
Subject: Re: [PATCH 1/6] bridge: learn dst metadata in FDB
Date: Thu, 17 Aug 2017 15:19:59 +0300 [thread overview]
Message-ID: <c904f967-a536-5398-dc48-8adb4866ef51@cumulusnetworks.com> (raw)
In-Reply-To: <20170817121020.GL773745@eidolon>
On 17/08/17 15:10, David Lamparter wrote:
> On Thu, Aug 17, 2017 at 02:51:12PM +0300, Nikolay Aleksandrov wrote:
>> On 17/08/17 14:39, Nikolay Aleksandrov wrote:
>>> On 17/08/17 14:03, David Lamparter wrote:
>>>> On Wed, Aug 16, 2017 at 11:38:06PM +0300, Nikolay Aleksandrov wrote:
> [cut]
>>>>> and hitting the fast path for everyone in a few different places for a
>>>>> feature that the majority will not use does not sound acceptable to
>>>>> me. We've been trying hard to optimize it, trying to avoid additional
>>>>> cache lines, removing tests and keeping special cases to a minimum.
>>>>
>>>> skb->dst is on the same cacheline as skb->len.
>>>> fdb->md_dst is on the same cacheline as fdb->dst.
>>>> Both will be 0 in a lot of cases, so this should be two null checks on
>>>> data that is hot in the cache. Are you sure this is an actual problem?
>>>>
>>>
>>> Sure - no, I haven't benchmarked it, but I don't see skb->len being on
>>> the same cache line as _skb_refdst assuming 64 byte cache lines.
>>
>> I should've been clearer - that obviously depends on the kernel config, but
>> in order for them to be in the same line you need to disable either one of
>> conntrack, bridge_netfilter or xfrm, these are almost always enabled (at
>> least in all major distributions).
>
> Did I miscount somewhere? This is what I counted:
> offs size
> 00 16 next/prev/other union bits
> 16 8 sk
> 24 8 dev
> 32 32 cb (first 32 bytes)
> ---- boundary @ 64
> 64 16 cb (last 16 bytes)
> 80 8 _skb_refdst
> 88 8 destructor
> 96 8 (0) sp
> 104 8 (0) _nfct
> 112 8 (0) nf_bridge
> 120 4 len
> 124 4 data_len
> ---- boundary @ 128
> 128 2 mac_len
> 130 2 hdr_len
>
>
> -David
>
What kernel ?
pahole -C sk_buff
struct sk_buff {
union {
struct {
struct sk_buff * next; /* 0 8 */
struct sk_buff * prev; /* 8 8 */
union {
ktime_t tstamp; /* 8 */
u64 skb_mstamp; /* 8 */
}; /* 16 8 */
}; /* 24 */
struct rb_node rbnode; /* 24 */
}; /* 0 24 */
struct sock * sk; /* 24 8 */
union {
struct net_device * dev; /* 8 */
long unsigned int dev_scratch; /* 8 */
}; /* 32 8 */
char cb[48]; /* 40 48 */
/* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
long unsigned int _skb_refdst; /* 88 8 */
void (*destructor)(struct sk_buff *); /* 96 8 */
struct sec_path * sp; /* 104 8 */
long unsigned int _nfct; /* 112 8 */
struct nf_bridge_info * nf_bridge; /* 120 8 */
/* --- cacheline 2 boundary (128 bytes) --- */
unsigned int len; /* 128 4 */
unsigned int data_len; /* 132 4 */
WARNING: multiple messages have this Message-ID (diff)
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
To: David Lamparter <equinox@diac24.net>
Cc: netdev@vger.kernel.org, roopa@cumulusnetworks.com,
"bridge@lists.linux-foundation.org"
<bridge@lists.linux-foundation.org>,
amine.kherbouche@6wind.com
Subject: Re: [Bridge] [PATCH 1/6] bridge: learn dst metadata in FDB
Date: Thu, 17 Aug 2017 15:19:59 +0300 [thread overview]
Message-ID: <c904f967-a536-5398-dc48-8adb4866ef51@cumulusnetworks.com> (raw)
In-Reply-To: <20170817121020.GL773745@eidolon>
On 17/08/17 15:10, David Lamparter wrote:
> On Thu, Aug 17, 2017 at 02:51:12PM +0300, Nikolay Aleksandrov wrote:
>> On 17/08/17 14:39, Nikolay Aleksandrov wrote:
>>> On 17/08/17 14:03, David Lamparter wrote:
>>>> On Wed, Aug 16, 2017 at 11:38:06PM +0300, Nikolay Aleksandrov wrote:
> [cut]
>>>>> and hitting the fast path for everyone in a few different places for a
>>>>> feature that the majority will not use does not sound acceptable to
>>>>> me. We've been trying hard to optimize it, trying to avoid additional
>>>>> cache lines, removing tests and keeping special cases to a minimum.
>>>>
>>>> skb->dst is on the same cacheline as skb->len.
>>>> fdb->md_dst is on the same cacheline as fdb->dst.
>>>> Both will be 0 in a lot of cases, so this should be two null checks on
>>>> data that is hot in the cache. Are you sure this is an actual problem?
>>>>
>>>
>>> Sure - no, I haven't benchmarked it, but I don't see skb->len being on
>>> the same cache line as _skb_refdst assuming 64 byte cache lines.
>>
>> I should've been clearer - that obviously depends on the kernel config, but
>> in order for them to be in the same line you need to disable either one of
>> conntrack, bridge_netfilter or xfrm, these are almost always enabled (at
>> least in all major distributions).
>
> Did I miscount somewhere? This is what I counted:
> offs size
> 00 16 next/prev/other union bits
> 16 8 sk
> 24 8 dev
> 32 32 cb (first 32 bytes)
> ---- boundary @ 64
> 64 16 cb (last 16 bytes)
> 80 8 _skb_refdst
> 88 8 destructor
> 96 8 (0) sp
> 104 8 (0) _nfct
> 112 8 (0) nf_bridge
> 120 4 len
> 124 4 data_len
> ---- boundary @ 128
> 128 2 mac_len
> 130 2 hdr_len
>
>
> -David
>
What kernel ?
pahole -C sk_buff
struct sk_buff {
union {
struct {
struct sk_buff * next; /* 0 8 */
struct sk_buff * prev; /* 8 8 */
union {
ktime_t tstamp; /* 8 */
u64 skb_mstamp; /* 8 */
}; /* 16 8 */
}; /* 24 */
struct rb_node rbnode; /* 24 */
}; /* 0 24 */
struct sock * sk; /* 24 8 */
union {
struct net_device * dev; /* 8 */
long unsigned int dev_scratch; /* 8 */
}; /* 32 8 */
char cb[48]; /* 40 48 */
/* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */
long unsigned int _skb_refdst; /* 88 8 */
void (*destructor)(struct sk_buff *); /* 96 8 */
struct sec_path * sp; /* 104 8 */
long unsigned int _nfct; /* 112 8 */
struct nf_bridge_info * nf_bridge; /* 120 8 */
/* --- cacheline 2 boundary (128 bytes) --- */
unsigned int len; /* 128 4 */
unsigned int data_len; /* 132 4 */
next prev parent reply other threads:[~2017-08-17 12:19 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-16 17:01 [RFC net-next] VPLS support David Lamparter
2017-08-16 17:01 ` [PATCH 1/6] bridge: learn dst metadata in FDB David Lamparter
2017-08-16 20:38 ` Nikolay Aleksandrov
2017-08-16 20:38 ` [Bridge] " Nikolay Aleksandrov
2017-08-17 11:03 ` David Lamparter
2017-08-17 11:03 ` [Bridge] " David Lamparter
2017-08-17 11:39 ` Nikolay Aleksandrov
2017-08-17 11:39 ` [Bridge] " Nikolay Aleksandrov
2017-08-17 11:51 ` Nikolay Aleksandrov
2017-08-17 11:51 ` [Bridge] " Nikolay Aleksandrov
2017-08-17 12:10 ` David Lamparter
2017-08-17 12:10 ` [Bridge] " David Lamparter
2017-08-17 12:19 ` Nikolay Aleksandrov [this message]
2017-08-17 12:19 ` Nikolay Aleksandrov
2017-08-17 12:20 ` David Lamparter
2017-08-17 12:20 ` [Bridge] " David Lamparter
2017-08-17 12:45 ` David Lamparter
2017-08-17 12:45 ` [Bridge] " David Lamparter
2017-08-17 13:04 ` Nikolay Aleksandrov
2017-08-17 13:04 ` [Bridge] " Nikolay Aleksandrov
2017-08-17 16:16 ` David Lamparter
2017-08-17 16:16 ` [Bridge] " David Lamparter
2017-08-16 17:01 ` [PATCH 2/6] mpls: split forwarding path on rx/tx boundary David Lamparter
2017-08-19 17:10 ` kbuild test robot
2017-08-19 17:42 ` kbuild test robot
2017-08-16 17:01 ` [PATCH 3/6] mpls: add VPLS entry points David Lamparter
2017-08-19 18:27 ` kbuild test robot
2017-08-21 14:01 ` Amine Kherbouche
2017-08-21 15:55 ` David Lamparter
2017-08-21 16:13 ` Amine Kherbouche
2017-08-16 17:02 ` [PATCH 4/6] mpls: VPLS support David Lamparter
2017-08-21 15:14 ` Amine Kherbouche
2017-08-21 16:18 ` David Lamparter
2017-08-21 16:11 ` Amine Kherbouche
2017-08-16 17:02 ` [PATCH 5/6] bridge: add VPLS pseudowire info in fdb dump David Lamparter
2017-08-16 17:02 ` [PATCH 6/6] mpls: pseudowire control word support David Lamparter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c904f967-a536-5398-dc48-8adb4866ef51@cumulusnetworks.com \
--to=nikolay@cumulusnetworks.com \
--cc=amine.kherbouche@6wind.com \
--cc=bridge@lists.linux-foundation.org \
--cc=equinox@diac24.net \
--cc=netdev@vger.kernel.org \
--cc=roopa@cumulusnetworks.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.