From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
To: Stephen Hemminger <stephen@networkplumber.org>,
	David Lamparter <equinox@diac24.net>
Cc: netdev@vger.kernel.org, roopa@cumulusnetworks.com,
	bridge@lists.linux-foundation.org, amine.kherbouche@6wind.com
Subject: Re: [RFC net-next v2] bridge lwtunnel, VPLS & NVGRE
Date: Tue, 22 Aug 2017 14:01:40 +0300
Message-ID: <f96d548c-7a64-ae9a-d1ea-e9fd065f9303@cumulusnetworks.com>
In-Reply-To: <20170821170151.5b12a392@xeon-e3>

On 22/08/17 03:01, Stephen Hemminger wrote:
> On Mon, 21 Aug 2017 19:15:17 +0200
> David Lamparter <equinox@diac24.net> wrote:
> 
>> Hi all,
>>
>>
>> this is an update on the earlier "[RFC net-next] VPLS support".  Note
>> I've changed the subject lines on some of the patches to better reflect
>> what they really do (tbh the earlier subject lines were crap.)
>>
>> As previously, iproute2 / FRR patches are at:
>> - https://github.com/eqvinox/vpls-iproute2
>> - https://github.com/opensourcerouting/frr/commits/vpls
>> while this patchset is also available at:
>> - https://github.com/eqvinox/vpls-linux-kernel
>> (but please be aware that I'm amending and rebasing commits)
>>
>> The NVGRE implementation in the 3rd patch in this series is actually an
>> accident - I was just wiring up gretap as a reference;  only after I was
>> done I noticed that that sums up to NVGRE, more or less.  IMHO, it does
>> serve well to demonstrate the bridge changes are not VPLS-specific.
>>
>> To refer to some notes from the first announcement mail:
>>> I've tested some basic setups, the chain from LDP down into the kernel
>>> works at least in these.  FRR has some testcases around from OpenBSD
>>> VPLS support, I haven't wired that up to run against Linux / this
>>> patchset yet.  
>>
>> Same as before (API didn't change).
>>
>>> The patchset needs a lot of polishing (yes I left my TODO notes in the
>>> commit messages), for now my primary concern is overall design
>>> feedback.  Roopa has already provided a lot of input (Thanks!);  the
>>> major topic I'm expecting to get discussion on is the bridge FDB
>>> changes.  
>>
>> Got some useful input;  but still need feedback on the bridge FDB
>> changes (first 2 patches).  I don't believe it to have a significant
>> impact on existing bridge operation, and I believe a multipoint tunnel
>> driver without its own FDB (e.g. NVGRE in this set) should perform
>> better than one with its own FDB (e.g. existing VXLAN).
>>
>>> P.S.: For a little context on the bridge FDB changes - I'm hoping to
>>> find some time to extend this to the MDB to allow aggregating dst
>>> metadata and handing down a list of dst metas on TX.  This isn't
>>> specifically for VPLS but rather to give sufficient information to the
>>> 802.11 stack to allow it to optimize selecting rates (or unicasting)
>>> for multicast traffic by having the multicast subscriber list known.
>>> This is done by major commercial wifi solutions (e.g. google "dynamic
>>> multicast optimization".)  
>>
>> You can find hacks toward this at:
>> https://github.com/eqvinox/vpls-linux-kernel/tree/mdb-hack
>> Please note that the patches in that branch are not at an acceptable
>> quality level, but you can see the semantic relation to 802.11.
>>
>> I would, however, like to point out that this branch has pseudo-working
>> IGMP/MLD snooping for VPLS, and it'd be 20-ish lines to add it to NVGRE
>> (I'll do that as soon as I get to it, it'll pop up on that branch too.)
>>
>> This is relevant to the discussion because it's a feature which is
>> non-obvious (to me) on how to do with the VXLAN model of having an
>> entirely separate FDB.  Meanwhile, with this architecture, the proof of
>> concept / hack is coming in at a measly cost of:
>> 8 files changed, 176 insertions(+), 15 deletions(-)
>>
>>
>> Cheers,
>>
>> -David
>>
>>
>> --- diffstat:
>> include/linux/netdevice.h      |  18 ++++++
>> include/net/dst_metadata.h     |  51 ++++++++++++++---
>> include/net/ip_tunnels.h       |   5 ++
>> include/uapi/linux/lwtunnel.h  |   8 +++
>> include/uapi/linux/neighbour.h |   2 +
>> include/uapi/linux/rtnetlink.h |   5 ++
>> net/bridge/br.c                |   2 +-
>> net/bridge/br_device.c         |   4 ++
>> net/bridge/br_fdb.c            | 119 ++++++++++++++++++++++++++++++++--------
>> net/bridge/br_input.c          |   6 +-
>> net/bridge/br_private.h        |   6 +-
>> net/core/lwtunnel.c            |   1 +
>> net/ipv4/ip_gre.c              |  40 ++++++++++++--
>> net/ipv4/ip_tunnel.c           |   1 +
>> net/ipv4/ip_tunnel_core.c      |  87 +++++++++++++++++++++++------
>> net/mpls/Kconfig               |  11 ++++
>> net/mpls/Makefile              |   1 +
>> net/mpls/af_mpls.c             | 113 ++++++++++++++++++++++++++++++++------
>> net/mpls/internal.h            |  44 +++++++++++++--
>> net/mpls/vpls.c                | 550 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> 20 files changed, 990 insertions(+), 84 deletions(-)
> 
> I know the bridge is an easy target to extend L2 forwarding, but it is not
> the only option. Have you considered building a new driver (like VXLAN does)
> which does the forwarding you want? Having all features in one driver
> makes for worse performance, and increased complexity.
> 

+1

As I said before, a separate implementation will be much cleaner and will not affect
the bridge in any way.  Paying both a performance and a complexity price for something
that the majority of users will not use isn't worth it.  In addition, this creates a
silent dependency between the bridge and the users of the FDB metadata dst; it would be
much preferable to be able to run them separately.
If there is any code that VPLS (or anyone else) will need to reuse, figure out a way
to factor it out.
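
For readers following the thread, the two designs being compared can be
illustrated with a toy model.  This is only a sketch, not kernel code: the
names TunnelMeta, forward_shared and forward_separate, the port names and the
addresses are all hypothetical, and plain dictionaries stand in for the real
FDB hash tables.  Model A mirrors the RFC's idea of bridge FDB entries that
optionally carry tunnel metadata; model B mirrors the VXLAN-style split where
the tunnel driver owns a second FDB.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class TunnelMeta:
    """Hypothetical per-destination tunnel info (remote endpoint + key/label)."""
    remote: str
    key: int


# Model A: one shared FDB. Every entry has room for optional tunnel metadata,
# so the bridge data path has to carry and check it for all traffic.
def forward_shared(
    fdb: Dict[str, Tuple[str, Optional[TunnelMeta]]], dst_mac: str
) -> Tuple[str, Optional[TunnelMeta]]:
    entry = fdb.get(dst_mac)
    if entry is None:
        return ("flood", None)  # unknown unicast: flood
    return entry


# Model B: the bridge FDB only maps MAC -> port. A tunnel driver behind a port
# keeps its own MAC -> metadata table, costing a second lookup on that port only.
def forward_separate(
    bridge_fdb: Dict[str, str],
    tunnel_fdbs: Dict[str, Dict[str, TunnelMeta]],
    dst_mac: str,
) -> Tuple[str, Optional[TunnelMeta]]:
    port = bridge_fdb.get(dst_mac)
    if port is None:
        return ("flood", None)
    driver_fdb = tunnel_fdbs.get(port)  # None for ordinary (non-tunnel) ports
    meta = driver_fdb.get(dst_mac) if driver_fdb is not None else None
    return (port, meta)
```

In this toy model both variants resolve a MAC address to the same port and
metadata; what differs is where the tunnel state lives and which code path
has to know about it, which is the dependency question raised above.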
