All of lore.kernel.org
 help / color / mirror / Atom feed
From: roopa <roopa@cumulusnetworks.com>
To: David Ahern <dsahern@gmail.com>
Cc: netdev@vger.kernel.org, ebiederm@xmission.com,
	Dinesh Dutt <ddutt@cumulusnetworks.com>,
	Vipin Kumar <vipin@cumulusnetworks.com>,
	Nicolas Dichtel <nicolas.dichtel@6wind.com>,
	hannes@stressinduktion.org
Subject: Re: [RFC PATCH 00/29] net: VRF support
Date: Thu, 05 Feb 2015 15:12:57 -0800	[thread overview]
Message-ID: <54D3F8F9.2060500@cumulusnetworks.com> (raw)
In-Reply-To: <1423100070-31848-1-git-send-email-dsahern@gmail.com>

On 2/4/15, 5:34 PM, David Ahern wrote:
> Kernel patches are also available here:
>      https://github.com/dsahern/linux.git vrf-3.19
>
> iproute2 patches are also available here:
>      https://github.com/dsahern/iproute2 vrf-3.19
>
>
> Background
> ----------
> The concept of VRFs (Virtual Routing and Forwarding) has been around for over
> 15 years. Support for VRFs in the Linux kernel has been an often requested
> feature for almost as long. For a while support was available via an out of
> tree patch [1]. Since network namespaces came along, the response to queries
> about VRF support for Linux was 'use namespaces'. But as mentioned previously
> [2] network namespaces are not a good match for VRFs. Of the list of problems
> noted the big one is that namespaces do not scale efficiently to the number
> of VRFs supported by networking gear (> 1000 VRFs). Networking vendors that
> want to use Linux as the OS have to carry custom solutions to this problem --
> be it userspace networking stacks, extensive kernel patches (to add VRF
> support or bend the implementation of namespaces), and/or patches to many
> open source components. The recent addition of switchdev support in the
> kernel suggests that people expect the use of Linux as a switch networking
> OS to increase. Hopefully the time is right to re-open the discussion on a
> salable VRF implementation for the Linux kernel.
>
> The intent of this RFC is to get feedback on the overall idea - namely VRFs
> as integer id and the nesting of VRFs within a namespace. This set includes
> changes only to core IPv4 code which shows the concept; changes to the rest
> of the network stack are fairly repetitive.
>
> This patch set has a number of similarities to the original VRF patch - most
> notably VRF ids as an integer index and plumbing through iproute2 and
> netlink. But this set is really a complete re-implementation of the feature,
> integrating VRF within a namespace and leveraging existing support for
> network namespaces.
>
> Design
> ------
> Namespaces provide excellent separation of the networking stack from the
> netdevices and up. The intent of VRFs is to provide an additional,
> logical separation at the L3 layer within a namespace.
>
>     +----------------------------------------------------------+
>     | Namespace foo                                            |
>     |                         +---------------+                |
>     |          +------+       | L3/L4 service |                |
>     |          | lldp |       |   (VRF any)   |                |
>     |          +------+       +---------------+                |
>     |                                                          |
>     |                             +-------------------------+  |
>     |                             | VRF M                   |  |
>     |  +---------------------+  +-------------------------+ |  |
>     |  | VRF 1 (default)     |  | VRF N                   | |  |
>     |  |  +---------------+  |  |    +---------------+    | |  |
>     |  |  | L3/L4 service |  |  |    | L3/L4 service |    | |  |
>     |  |  | (VRF unaware) |  |  |    | (VRF unaware) |    | |  |
>     |  |  +---------------+  |  |    +---------------+    | |  |
>     |  |                     |  |                         | |  |
>     |  |+-----+ +----------+ |  |  +-----+ +----------+   | |  |
>     |  || FIB | | neighbor | |  |  | FIB | | neighbor |   | |  |
>     |  |+-----+ +----------+ |  |  +-----+ +----------+   | |  |
>     |  |                     |  |                         |-+  |
>     |  | {dev 1}  {dev 2}    |  | {dev 3} {dev 4} {dev 5} |    |
>     |  +---------------------+  +-------------------------+    |
>     +----------------------------------------------------------+
>
> This is accomplished by enhancing the current namespace checks to a
> broader network context that is both a namepsace and a VRF id. The VRF
> id is a tag applied to relevant structures, an integer between 1 and 4095
> which allows for 4095 VRFs (could have 0 be the default VRF and then the
> range is 0-4095 = 4096s VRFs). (The limitation is arguably artificial. It
> is based on the genid scheme for versioning networking data which is a
> 32-bit integer. The VRF id is the lower 12 bits of the genid's.)
>
> Netdevices, sk_buffs, sockets, and tasks are all tagged with a VRF id.
> Network lookups (devices, sockets, addresses, routes, neighbors) require a
> match of both network namespace and VRF id (or the special 'vrf any' tag;
> more on that later).
>
>
David,

Wondering if you have thought about some of the the below cases in your 
approach to vrfs ?.
- Leaking routes from one vrf to another
- route lookup in one vrf on failure to fallback to the global vrf (This 
for example can be done using throw if we used ip rules and route tables 
to do the same).
- A route in one vrf pointing to a nexthop in another vrf

We have been playing with ip rules to implement vrfs. And the blocker 
today is that we cannot bind a socket to a vrf (routing tables in this 
case).

Thanks,
Roopa

  parent reply	other threads:[~2015-02-05 23:12 UTC|newest]

Thread overview: 119+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-05  1:34 [RFC PATCH 00/29] net: VRF support David Ahern
2015-02-05  1:34 ` [RFC PATCH 01/29] net: Introduce net_ctx and macro for context comparison David Ahern
2015-02-05  1:34 ` [RFC PATCH 02/29] net: Flip net_device to use net_ctx David Ahern
2015-02-05 13:47   ` Nicolas Dichtel
2015-02-06  0:45     ` David Ahern
2015-02-05  1:34 ` [RFC PATCH 03/29] net: Flip sock_common to net_ctx David Ahern
2015-02-05  1:34 ` [RFC PATCH 04/29] net: Add net_ctx macros for skbuffs David Ahern
2015-02-05  1:34 ` [RFC PATCH 05/29] net: Flip seq_net_private to net_ctx David Ahern
2015-02-05  1:34 ` [RFC PATCH 06/29] net: Flip fib_rules and fib_rules_ops to use net_ctx David Ahern
2015-02-05  1:34 ` [RFC PATCH 07/29] net: Flip inet_bind_bucket to net_ctx David Ahern
2015-02-05  1:34 ` [RFC PATCH 08/29] net: Flip fib_info " David Ahern
2015-02-05  1:34 ` [RFC PATCH 09/29] net: Flip ip6_flowlabel " David Ahern
2015-02-05  1:34 ` [RFC PATCH 10/29] net: Flip neigh structs " David Ahern
2015-02-05  1:34 ` [RFC PATCH 11/29] net: Flip nl_info " David Ahern
2015-02-05  1:34 ` [RFC PATCH 12/29] net: Add device lookups by net_ctx David Ahern
2015-02-05  1:34 ` [RFC PATCH 13/29] net: Convert function arg from struct net to struct net_ctx David Ahern
2015-02-05  1:34 ` [RFC PATCH 14/29] net: vrf: Introduce vrf header file David Ahern
2015-02-05 13:44   ` Nicolas Dichtel
2015-02-06  0:52     ` David Ahern
2015-02-06  8:53       ` Nicolas Dichtel
2015-02-05  1:34 ` [RFC PATCH 15/29] net: vrf: Add vrf to net_ctx struct David Ahern
2015-02-05  1:34 ` [RFC PATCH 16/29] net: vrf: Set default vrf David Ahern
2015-02-05  1:34 ` [RFC PATCH 17/29] net: vrf: Add vrf context to task struct David Ahern
2015-02-05  1:34 ` [RFC PATCH 18/29] net: vrf: Plumbing for vrf context on a socket David Ahern
2015-02-05 13:44   ` Nicolas Dichtel
2015-02-06  1:18     ` David Ahern
2015-02-05  1:34 ` [RFC PATCH 19/29] net: vrf: Add vrf context to skb David Ahern
2015-02-05 13:45   ` Nicolas Dichtel
2015-02-06  1:21     ` David Ahern
2015-02-06  3:54   ` Eric W. Biederman
2015-02-06  6:00     ` David Ahern
2015-02-05  1:34 ` [RFC PATCH 20/29] net: vrf: Add vrf context to flow struct David Ahern
2015-02-05  1:34 ` [RFC PATCH 21/29] net: vrf: Add vrf context to genid's David Ahern
2015-02-05  1:34 ` [RFC PATCH 22/29] net: vrf: Set VRF id in various network structs David Ahern
2015-02-05  1:34 ` [RFC PATCH 23/29] net: vrf: Enable vrf checks David Ahern
2015-02-05  1:34 ` [RFC PATCH 24/29] net: vrf: Add support to get/set vrf context on a device David Ahern
2015-02-05  1:34 ` [RFC PATCH 25/29] net: vrf: Handle VRF any context David Ahern
2015-02-05 13:46   ` Nicolas Dichtel
2015-02-06  1:23     ` David Ahern
2015-02-05  1:34 ` [RFC PATCH 26/29] net: vrf: Change single_open_net to pass net_ctx David Ahern
2015-02-05  1:34 ` [RFC PATCH 27/29] net: vrf: Add vrf checks and context to ipv4 proc files David Ahern
2015-02-05  1:34 ` [RFC PATCH 28/29] iproute2: vrf: Add vrf subcommand David Ahern
2015-02-05  1:34 ` [RFC PATCH 29/29] iproute2: Add vrf option to ip link command David Ahern
2015-02-05  5:17 ` [RFC PATCH 00/29] net: VRF support roopa
2015-02-05 13:44 ` Nicolas Dichtel
2015-02-06  1:32   ` David Ahern
2015-02-06  8:53     ` Nicolas Dichtel
2015-02-05 23:12 ` roopa [this message]
2015-02-06  2:19   ` David Ahern
2015-02-09 16:38     ` roopa
2015-02-10 10:43     ` Derek Fawcus
2015-02-06  6:10   ` Shmulik Ladkani
2015-02-09 15:54     ` roopa
2015-02-11  7:42       ` Shmulik Ladkani
2015-02-06  1:33 ` Stephen Hemminger
2015-02-06  2:10   ` David Ahern
2015-02-06  4:14     ` Eric W. Biederman
2015-02-06  6:15       ` David Ahern
2015-02-06 15:08         ` Nicolas Dichtel
     [not found]         ` <87iofe7n1x.fsf@x220.int.ebiederm.org>
2015-02-09 20:48           ` Nicolas Dichtel
2015-02-11  4:14           ` David Ahern
2015-02-06 15:10 ` Nicolas Dichtel
2015-02-06 20:50 ` Eric W. Biederman
2015-02-09  0:36   ` David Ahern
2015-02-09 11:30     ` Derek Fawcus
     [not found]   ` <871tlxtbhd.fsf_-_@x220.int.ebiederm.org>
2015-02-11  2:55     ` network namespace bloat Eric Dumazet
2015-02-11  3:18       ` Eric W. Biederman
2015-02-19 19:49         ` David Miller
2015-03-09 18:22           ` [PATCH net-next 0/6] tcp_metrics: Network namespace bloat reduction Eric W. Biederman
2015-03-09 18:27             ` [PATCH net-next 1/6] tcp_metrics: panic when tcp_metrics can not be allocated Eric W. Biederman
2015-03-09 18:50               ` Sergei Shtylyov
2015-03-11 19:22                 ` Sergei Shtylyov
2015-03-09 18:27             ` [PATCH net-next 2/6] tcp_metrics: Mix the network namespace into the hash function Eric W. Biederman
2015-03-09 18:29             ` [PATCH net-next 3/6] tcp_metrics: Add a field tcpm_net and verify it matches on lookup Eric W. Biederman
2015-03-09 20:25               ` Julian Anastasov
2015-03-10  6:59                 ` Eric W. Biederman
2015-03-10  8:23                   ` Julian Anastasov
2015-03-11  0:58                     ` Eric W. Biederman
2015-03-10 16:36                   ` David Miller
2015-03-10 17:06                     ` Eric W. Biederman
2015-03-10 17:29                       ` David Miller
2015-03-10 17:56                         ` Eric W. Biederman
2015-03-09 18:30             ` [PATCH net-next 4/6] tcp_metrics: Remove the unused return code from tcp_metrics_flush_all Eric W. Biederman
2015-03-09 18:30             ` [PATCH net-next 5/6] tcp_metrics: Rewrite tcp_metrics_flush_all Eric W. Biederman
2015-03-09 18:31             ` [PATCH net-next 6/6] tcp_metrics: Use a single hash table for all network namespaces Eric W. Biederman
2015-03-09 18:43               ` Eric Dumazet
2015-03-09 18:47               ` Eric Dumazet
2015-03-09 19:35                 ` Eric W. Biederman
2015-03-09 20:21                   ` Eric Dumazet
2015-03-09 20:09             ` [PATCH net-next 0/6] tcp_metrics: Network namespace bloat reduction David Miller
2015-03-09 20:21               ` Eric W. Biederman
2015-03-11 16:33             ` [PATCH net-next 0/8] tcp_metrics: Network namespace bloat reduction v2 Eric W. Biederman
2015-03-11 16:35               ` [PATCH net-next 1/8] net: Kill hold_net release_net Eric W. Biederman
2015-03-11 16:55                 ` Eric Dumazet
2015-03-11 17:34                   ` Eric W. Biederman
2015-03-11 17:07                 ` Eric Dumazet
2015-03-11 17:08                   ` Eric Dumazet
2015-03-11 17:10                 ` Eric Dumazet
2015-03-11 17:36                   ` Eric W. Biederman
2015-03-11 16:36               ` [PATCH net-next 2/8] net: Introduce possible_net_t Eric W. Biederman
2015-03-11 16:38               ` [PATCH net-next 3/8] tcp_metrics: panic when tcp_metrics_init fails Eric W. Biederman
2015-03-11 16:38               ` [PATCH net-next 4/8] tcp_metrics: Mix the network namespace into the hash function Eric W. Biederman
2015-03-11 16:40               ` [PATCH net-next 5/8] tcp_metrics: Add a field tcpm_net and verify it matches on lookup Eric W. Biederman
2015-03-11 16:41               ` [PATCH net-next 6/8] tcp_metrics: Remove the unused return code from tcp_metrics_flush_all Eric W. Biederman
2015-03-11 16:43               ` [PATCH net-next 7/8] tcp_metrics: Rewrite tcp_metrics_flush_all Eric W. Biederman
2015-03-11 16:43               ` [PATCH net-next 8/8] tcp_metrics: Use a single hash table for all network namespaces Eric W. Biederman
2015-03-13  5:04               ` [PATCH net-next 0/6] tcp_metrics: Network namespace bloat reduction v3 Eric W. Biederman
2015-03-13  5:04                 ` [PATCH net-next 1/6] tcp_metrics: panic when tcp_metrics_init fails Eric W. Biederman
2015-03-13  5:05                 ` [PATCH net-next 2/6] tcp_metrics: Mix the network namespace into the hash function Eric W. Biederman
2015-03-13  5:05                 ` [PATCH net-next 3/6] tcp_metrics: Add a field tcpm_net and verify it matches on lookup Eric W. Biederman
2015-03-13  5:06                 ` [PATCH net-next 4/6] tcp_metrics: Remove the unused return code from tcp_metrics_flush_all Eric W. Biederman
2015-03-13  5:07                 ` [PATCH net-next 5/6] tcp_metrics: Rewrite tcp_metrics_flush_all Eric W. Biederman
2015-03-13  5:07                 ` [PATCH net-next 6/6] tcp_metrics: Use a single hash table for all network namespaces Eric W. Biederman
2015-03-13  5:57                 ` [PATCH net-next 0/6] tcp_metrics: Network namespace bloat reduction v3 David Miller
2015-02-11 17:09     ` network namespace bloat Nicolas Dichtel
2015-02-10  0:53 ` [RFC PATCH 00/29] net: VRF support Thomas Graf
2015-02-10 20:54   ` David Ahern
2016-05-25 16:04 ` Chenna
2016-05-25 19:04   ` David Ahern

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54D3F8F9.2060500@cumulusnetworks.com \
    --to=roopa@cumulusnetworks.com \
    --cc=ddutt@cumulusnetworks.com \
    --cc=dsahern@gmail.com \
    --cc=ebiederm@xmission.com \
    --cc=hannes@stressinduktion.org \
    --cc=netdev@vger.kernel.org \
    --cc=nicolas.dichtel@6wind.com \
    --cc=vipin@cumulusnetworks.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.