* [PATCH net-next 00/22 v2] Lightweight & flow based encapsulation
@ 2015-07-21  8:43 Thomas Graf
From: Thomas Graf @ 2015-07-21  8:43 UTC (permalink / raw)
  To: roopa-qUQiAmfTcIp+XZJcv9eMoEEOCMrvLtNR,
	rshearma-43mecJUBy8ZBDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r,
	pshelar-l0M0P4e3n4LQT0dZR+AlfA, jesse-l0M0P4e3n4LQT0dZR+AlfA,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, daniel-FeC+5ew28dpmcu3hnIyYJQ,
	tom-BjP2VixgY4xUbtYUoyoikg, edumazet-hpIqsD4AKlfQT0dZR+AlfA,
	jiri-rHqAuBHg3fBzbRFIqnYvSA,
	marcelo.leitner-Re5JQEeQqe8AvxtiuMwx3w,
	stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ,
	jpettit-l0M0P4e3n4LQT0dZR+AlfA, kaber-dcUjhNyLwpNeoWH0uzbU5w,
	simon.horman-wFxRvT7yatFl57MIdRCFDg,
	joestringer-l0M0P4e3n4LQT0dZR+AlfA, ja-FgGsKACvmQM,
	ast-uqk4Ao+rVK5Wk0Htik3J/w, weichunc-uqk4Ao+rVK5Wk0Htik3J/w
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

This series combines the work previously posted by Roopa, Robert and
myself. It follows what we discussed at NFWS. The motivation
of this series is to:

 * Consolidate code between OVS and the rest of the kernel and get
   rid of OVS vports and instead represent them as pure net_devices.
 * Introduce a lightweight tunneling mechanism which enables flow
   based encapsulation to improve scalability on both RX and TX.
 * Do the above in an encapsulation-agnostic way so that the
   encapsulation type is eventually abstracted away from the user.
 * Use the same forwarding decision for both native forwarding and
   encapsulation, thus allowing switching between native IPv6 and
   UDP encapsulation based on endpoint without requiring additional
   logic.

The fundamental changes introduced in this series are:
 * A new RTA_ENCAP Netlink attribute for routes carrying encapsulation
   instructions. Depending on the specified type, the instructions
   apply to UDP encapsulations, MPLS and possibly others in the future.
 * Depending on the encapsulation type, the output function of the
   dst is directly overwritten or the dst merely attaches metadata and
   relies on a subsequent net_device to apply it to the packet. The
   latter is typically used if an inner and outer IP header exist which
   require two subsequent routing lookups to be performed.
 * A new metadata_dst structure which can be attached to skbs to
   carry metadata in between subsystems. This new metadata transport
   is used to provide a single interface for VXLAN, routing and OVS
   to communicate through metadata.
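
As a rough user-space illustration of the two modes described in the
list above (this is not code from the series; every type and function
name below is hypothetical and only mimics the idea): depending on the
encapsulation type, either the dst output hook is replaced and
encapsulates directly, or the dst only carries metadata that a later
tunnel device applies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encap kinds for this sketch; not the kernel's enum values. */
enum toy_encap { TOY_ENCAP_NONE, TOY_ENCAP_MPLS, TOY_ENCAP_IP_TUNNEL };

/* Toy stand-in for per-packet tunnel metadata (tunnel id, remote endpoint). */
struct toy_tun_meta {
	uint32_t tun_id;
	uint32_t remote_ip;	/* host byte order, for illustration only */
};

struct toy_pkt;

/* Toy stand-in for a dst entry: an output hook plus optional metadata. */
struct toy_dst {
	int (*output)(struct toy_pkt *pkt);
	const struct toy_tun_meta *meta;	/* NULL unless metadata is attached */
	uint32_t mpls_label;
};

struct toy_pkt {
	struct toy_dst *dst;
	int encapsulated;	/* set once some encapsulation has been applied */
	uint32_t label;		/* label pushed by the redirected output hook */
	uint32_t vni;		/* tunnel id applied by the tunnel device */
};

/* Plain forwarding: nothing is added to the packet. */
static int toy_native_output(struct toy_pkt *pkt)
{
	(void)pkt;
	return 0;
}

/* Mode 1: the dst output hook itself performs the encapsulation. */
static int toy_mpls_output(struct toy_pkt *pkt)
{
	pkt->label = pkt->dst->mpls_label;
	pkt->encapsulated = 1;
	return 0;
}

/* Set up the dst according to the encap type configured on the route. */
static void toy_dst_init(struct toy_dst *dst, enum toy_encap type,
			 uint32_t label, const struct toy_tun_meta *meta)
{
	dst->output = toy_native_output;
	dst->meta = NULL;
	dst->mpls_label = 0;

	switch (type) {
	case TOY_ENCAP_MPLS:
		dst->output = toy_mpls_output;	/* output is overwritten */
		dst->mpls_label = label;
		break;
	case TOY_ENCAP_IP_TUNNEL:
		dst->meta = meta;		/* mode 2: only attach metadata */
		break;
	case TOY_ENCAP_NONE:
		break;
	}
}

/* Mode 2: a later tunnel device applies the attached metadata.
 * Returns -1 if the packet carries no metadata. */
static int toy_tunnel_xmit(struct toy_pkt *pkt)
{
	if (!pkt->dst || !pkt->dst->meta)
		return -1;
	pkt->vni = pkt->dst->meta->tun_id;
	pkt->encapsulated = 1;
	return 0;
}
```

With TOY_ENCAP_MPLS, calling dst->output() is enough to encapsulate.
With TOY_ENCAP_IP_TUNNEL, dst->output() leaves the packet untouched and
toy_tunnel_xmit() applies the tunnel id afterwards, which corresponds to
the case where a second routing lookup happens for the outer header.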

The OVS interfaces remain as-is but will transparently create a real
VXLAN net_device in the background. iproute2 is extended with new
use cases:

  VXLAN:
  ip route add 40.1.1.1/32 encap vxlan id 10 dst 50.1.1.2 dev vxlan0

  MPLS:
  ip route add 10.1.1.0/30 encap mpls 200 via inet 10.1.1.1 dev swp1
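
For reference, the label given in the MPLS example ends up in a 32-bit
label stack entry on the wire. A minimal sketch of that encoding,
following the RFC 3032 field layout (20-bit label, 3-bit traffic class,
bottom-of-stack bit, 8-bit TTL). The helper names are hypothetical and
not taken from this series, and the TTL of 255 used in the comment is
an assumed value:

```c
#include <stdint.h>

/* Field positions within a label stack entry, per RFC 3032. */
#define TOY_MPLS_LABEL_SHIFT	12
#define TOY_MPLS_TC_SHIFT	9
#define TOY_MPLS_BOS_SHIFT	8

/* Pack label, traffic class, bottom-of-stack flag and TTL into one
 * 32-bit label stack entry (host byte order). */
static uint32_t toy_mpls_lse_encode(uint32_t label, uint32_t tc,
				    uint32_t bos, uint32_t ttl)
{
	return ((label & 0xFFFFFu) << TOY_MPLS_LABEL_SHIFT) |
	       ((tc & 0x7u) << TOY_MPLS_TC_SHIFT) |
	       ((bos & 0x1u) << TOY_MPLS_BOS_SHIFT) |
	       (ttl & 0xFFu);
}

/* Serialize the entry in network byte order, independent of host
 * endianness. */
static void toy_mpls_lse_to_wire(uint32_t lse, uint8_t out[4])
{
	out[0] = (uint8_t)(lse >> 24);
	out[1] = (uint8_t)(lse >> 16);
	out[2] = (uint8_t)(lse >> 8);
	out[3] = (uint8_t)lse;
}

/* toy_mpls_lse_encode(200, 0, 1, 255) == 0x000C81FF,
 * i.e. the bytes 00 0c 81 ff on the wire. */
```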

Performance implications:
  The additional memory allocation in the receive path should have
  performance implications although it is not observable in standard
  throughput tests if GRO is properly done. The correct net_device
  model outweighs the additional cost of the allocation. Furthermore,
  this implication can be relaxed by reintroducing a direct unqueued
  path from a software device to a consumer like bridge or OVS if
  needed.

    $ netperf  -t TCP_STREAM -H 15.1.1.201
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
    15.1.1.201 (15.1.1.201) port 0 AF_INET : demo
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

     87380  16384  16384    10.00    9118.17

Changes since v1:
 * Properly initialize tun_id as reported by Julian
 * Drop duplicate netif_keep_dst() as reported by Alexei

Roopa Prabhu (9):
  rtnetlink: introduce new RTA_ENCAP_TYPE and RTA_ENCAP attributes
  lwtunnel: infrastructure for handling light weight tunnels like mpls
  ipv4: support for fib route lwtunnel encap attributes
  ipv6: support for fib route lwtunnel encap attributes
  lwtunnel: support dst output redirect function
  ipv4: redirect dst output to lwtunnel output
  ipv6: rt6_info output redirect to tunnel output
  mpls: export mpls functions for use by mpls iptunnels
  mpls: ip tunnel support

Thomas Graf (13):
  ip_tunnel: Make ovs_tunnel_info and ovs_key_ipv4_tunnel generic
  icmp: Don't leak original dst into ip_route_input()
  dst: Metadata destinations
  arp: Inherit metadata dst when creating ARP requests
  vxlan: Flow based tunneling
  route: Extend flow representation with tunnel key
  route: Per route IP tunnel metadata via lightweight tunnel
  fib: Add fib rule match on tunnel id
  vxlan: Factor out device configuration
  openvswitch: Make tunnel set action attach a metadata dst
  openvswitch: Move dev pointer into vport itself
  openvswitch: Abstract vport name through ovs_vport_name()
  openvswitch: Use regular VXLAN net_device device

 drivers/net/vxlan.c                  | 672 +++++++++++++++++++++--------------
 include/linux/lwtunnel.h             |   6 +
 include/linux/mpls_iptunnel.h        |   6 +
 include/linux/skbuff.h               |   1 +
 include/net/dst.h                    |   6 +-
 include/net/dst_metadata.h           |  55 +++
 include/net/fib_rules.h              |   1 +
 include/net/flow.h                   |   8 +
 include/net/ip6_fib.h                |   3 +
 include/net/ip_fib.h                 |   5 +-
 include/net/ip_tunnels.h             |  95 ++++-
 include/net/lwtunnel.h               | 144 ++++++++
 include/net/mpls_iptunnel.h          |  29 ++
 include/net/route.h                  |   1 +
 include/net/rtnetlink.h              |   1 +
 include/net/vxlan.h                  |  85 ++++-
 include/uapi/linux/fib_rules.h       |   2 +-
 include/uapi/linux/if_link.h         |   1 +
 include/uapi/linux/lwtunnel.h        |  16 +
 include/uapi/linux/mpls_iptunnel.h   |  28 ++
 include/uapi/linux/openvswitch.h     |   2 +-
 include/uapi/linux/rtnetlink.h       |  17 +
 net/Kconfig                          |   7 +
 net/core/Makefile                    |   1 +
 net/core/dev.c                       |   2 +-
 net/core/dst.c                       |  84 ++++-
 net/core/fib_rules.c                 |  24 +-
 net/core/lwtunnel.c                  | 235 ++++++++++++
 net/core/rtnetlink.c                 |  26 +-
 net/ipv4/arp.c                       |  65 ++--
 net/ipv4/fib_frontend.c              |  10 +
 net/ipv4/fib_semantics.c             |  96 ++++-
 net/ipv4/icmp.c                      |   1 +
 net/ipv4/ip_input.c                  |   3 +-
 net/ipv4/ip_tunnel_core.c            | 130 +++++++
 net/ipv4/route.c                     |  28 +-
 net/ipv6/ip6_fib.c                   |   2 +
 net/ipv6/route.c                     |  34 +-
 net/mpls/Kconfig                     |   8 +-
 net/mpls/Makefile                    |   1 +
 net/mpls/af_mpls.c                   |  11 +-
 net/mpls/internal.h                  |   9 +-
 net/mpls/mpls_iptunnel.c             | 233 ++++++++++++
 net/openvswitch/Kconfig              |  12 -
 net/openvswitch/Makefile             |   1 -
 net/openvswitch/actions.c            |  12 +-
 net/openvswitch/datapath.c           |  19 +-
 net/openvswitch/datapath.h           |   5 +-
 net/openvswitch/dp_notify.c          |   5 +-
 net/openvswitch/flow.c               |   4 +-
 net/openvswitch/flow.h               |  79 +---
 net/openvswitch/flow_netlink.c       |  84 ++++-
 net/openvswitch/flow_netlink.h       |   3 +-
 net/openvswitch/flow_table.c         |   4 +-
 net/openvswitch/vport-geneve.c       |  17 +-
 net/openvswitch/vport-gre.c          |  16 +-
 net/openvswitch/vport-internal_dev.c |  38 +-
 net/openvswitch/vport-netdev.c       | 289 ++++++++++++---
 net/openvswitch/vport-netdev.h       |  13 -
 net/openvswitch/vport-vxlan.c        | 322 -----------------
 net/openvswitch/vport-vxlan.h        |  11 -
 net/openvswitch/vport.c              |  34 +-
 net/openvswitch/vport.h              |  21 +-
 63 files changed, 2231 insertions(+), 952 deletions(-)
 create mode 100644 include/linux/lwtunnel.h
 create mode 100644 include/linux/mpls_iptunnel.h
 create mode 100644 include/net/dst_metadata.h
 create mode 100644 include/net/lwtunnel.h
 create mode 100644 include/net/mpls_iptunnel.h
 create mode 100644 include/uapi/linux/lwtunnel.h
 create mode 100644 include/uapi/linux/mpls_iptunnel.h
 create mode 100644 net/core/lwtunnel.c
 create mode 100644 net/mpls/mpls_iptunnel.c
 delete mode 100644 net/openvswitch/vport-vxlan.c
 delete mode 100644 net/openvswitch/vport-vxlan.h

-- 
2.4.3

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

* [PATCH net-next 00/22] Lightweight & flow based encapsulation
@ 2015-07-17 12:55 Thomas Graf
From: Thomas Graf @ 2015-07-17 12:55 UTC (permalink / raw)
  To: roopa-qUQiAmfTcIp+XZJcv9eMoEEOCMrvLtNR,
	rshearma-43mecJUBy8ZBDgjK7y7TUQ, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	hannes-tFNcAqjVMyqKXQKiL6tip0B+6BGkLq7r,
	pshelar-l0M0P4e3n4LQT0dZR+AlfA, jesse-l0M0P4e3n4LQT0dZR+AlfA,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, daniel-FeC+5ew28dpmcu3hnIyYJQ,
	tom-BjP2VixgY4xUbtYUoyoikg, edumazet-hpIqsD4AKlfQT0dZR+AlfA,
	jiri-rHqAuBHg3fBzbRFIqnYvSA,
	marcelo.leitner-Re5JQEeQqe8AvxtiuMwx3w,
	stephen-OTpzqLSitTUnbdJkjeBofR2eb7JE58TQ,
	jpettit-l0M0P4e3n4LQT0dZR+AlfA, kaber-dcUjhNyLwpNeoWH0uzbU5w,
	simon.horman-wFxRvT7yatFl57MIdRCFDg,
	joestringer-l0M0P4e3n4LQT0dZR+AlfA, ja-FgGsKACvmQM,
	ast-uqk4Ao+rVK5Wk0Htik3J/w, weichunc-uqk4Ao+rVK5Wk0Htik3J/w
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA

This series combines the work previously posted by Roopa, Robert and
myself. It follows what we discussed at NFWS. The motivation
of this series is to:

 * Consolidate code between OVS and the rest of the kernel and get
   rid of OVS vports and instead represent them as pure net_devices.
 * Introduce a lightweight tunneling mechanism which enables flow
   based encapsulation to improve scalability on both RX and TX.
 * Do the above in an encapsulation-agnostic way so that the
   encapsulation type is eventually abstracted away from the user.
 * Use the same forwarding decision for both native forwarding and
   encapsulation, thus allowing switching between native IPv6 and
   UDP encapsulation based on endpoint without requiring additional
   logic.

The fundamental changes introduced in this series are:
 * A new RTA_ENCAP Netlink attribute for routes carrying encapsulation
   instructions. Depending on the specified type, the instructions
   apply to UDP encapsulations, MPLS and possibly others in the future.
 * Depending on the encapsulation type, the output function of the
   dst is directly overwritten or the dst merely attaches metadata and
   relies on a subsequent net_device to apply it to the packet. The
   latter is typically used if an inner and outer IP header exist which
   require two subsequent routing lookups to be performed.
 * A new metadata_dst structure which can be attached to skbs to
   carry metadata in between subsystems. This new metadata transport
   is used to provide a single interface for VXLAN, routing and OVS
   to communicate through metadata.

The OVS interfaces remain as-is but will transparently create a real
VXLAN net_device in the background. iproute2 is extended with new
use cases:

  VXLAN:
  ip route add 40.1.1.1/32 encap vxlan id 10 dst 50.1.1.2 dev vxlan0

  MPLS:
  ip route add 10.1.1.0/30 encap mpls 200 via inet 10.1.1.1 dev swp1

Changes since RFC:
 * Addressed comments
 * Folded in various fixes provided by Roopa, Joe, and Wei-Chun Chao
 * New static key to only collect metadata on receive if a filter exists
   which matches on the relevant fields.

Roopa Prabhu (9):
  rtnetlink: introduce new RTA_ENCAP_TYPE and RTA_ENCAP attributes
  lwtunnel: infrastructure for handling light weight tunnels like mpls
  ipv4: support for fib route lwtunnel encap attributes
  ipv6: support for fib route lwtunnel encap attributes
  lwtunnel: support dst output redirect function
  ipv4: redirect dst output to lwtunnel output
  ipv6: rt6_info output redirect to tunnel output
  mpls: export mpls functions for use by mpls iptunnels
  mpls: ip tunnel support

Thomas Graf (13):
  ip_tunnel: Make ovs_tunnel_info and ovs_key_ipv4_tunnel generic
  icmp: Don't leak original dst into ip_route_input()
  dst: Metadata destinations
  arp: Inherit metadata dst when creating ARP requests
  vxlan: Flow based tunneling
  route: Extend flow representation with tunnel key
  route: Per route IP tunnel metadata via lightweight tunnel
  fib: Add fib rule match on tunnel id
  vxlan: Factor out device configuration
  openvswitch: Make tunnel set action attach a metadata dst
  openvswitch: Move dev pointer into vport itself
  openvswitch: Abstract vport name through ovs_vport_name()
  openvswitch: Use regular VXLAN net_device device

 drivers/net/vxlan.c                  | 678 +++++++++++++++++++++--------------
 include/linux/lwtunnel.h             |   6 +
 include/linux/mpls_iptunnel.h        |   6 +
 include/linux/skbuff.h               |   1 +
 include/net/dst.h                    |   6 +-
 include/net/dst_metadata.h           |  55 +++
 include/net/fib_rules.h              |   1 +
 include/net/flow.h                   |   7 +
 include/net/ip6_fib.h                |   3 +
 include/net/ip_fib.h                 |   5 +-
 include/net/ip_tunnels.h             |  95 ++++-
 include/net/lwtunnel.h               | 144 ++++++++
 include/net/mpls_iptunnel.h          |  29 ++
 include/net/route.h                  |   1 +
 include/net/rtnetlink.h              |   1 +
 include/net/vxlan.h                  |  85 ++++-
 include/uapi/linux/fib_rules.h       |   2 +-
 include/uapi/linux/if_link.h         |   1 +
 include/uapi/linux/lwtunnel.h        |  16 +
 include/uapi/linux/mpls_iptunnel.h   |  28 ++
 include/uapi/linux/openvswitch.h     |   2 +-
 include/uapi/linux/rtnetlink.h       |  17 +
 net/Kconfig                          |   7 +
 net/core/Makefile                    |   1 +
 net/core/dev.c                       |   2 +-
 net/core/dst.c                       |  84 ++++-
 net/core/fib_rules.c                 |  24 +-
 net/core/lwtunnel.c                  | 235 ++++++++++++
 net/core/rtnetlink.c                 |  26 +-
 net/ipv4/arp.c                       |  65 ++--
 net/ipv4/fib_frontend.c              |   8 +
 net/ipv4/fib_semantics.c             |  96 ++++-
 net/ipv4/icmp.c                      |   1 +
 net/ipv4/ip_input.c                  |   3 +-
 net/ipv4/ip_tunnel_core.c            | 130 +++++++
 net/ipv4/route.c                     |  26 +-
 net/ipv6/ip6_fib.c                   |   2 +
 net/ipv6/route.c                     |  34 +-
 net/mpls/Kconfig                     |   8 +-
 net/mpls/Makefile                    |   1 +
 net/mpls/af_mpls.c                   |  11 +-
 net/mpls/internal.h                  |   9 +-
 net/mpls/mpls_iptunnel.c             | 233 ++++++++++++
 net/openvswitch/Kconfig              |  12 -
 net/openvswitch/Makefile             |   1 -
 net/openvswitch/actions.c            |  12 +-
 net/openvswitch/datapath.c           |  19 +-
 net/openvswitch/datapath.h           |   5 +-
 net/openvswitch/dp_notify.c          |   5 +-
 net/openvswitch/flow.c               |   4 +-
 net/openvswitch/flow.h               |  79 +---
 net/openvswitch/flow_netlink.c       |  84 ++++-
 net/openvswitch/flow_netlink.h       |   3 +-
 net/openvswitch/flow_table.c         |   4 +-
 net/openvswitch/vport-geneve.c       |  17 +-
 net/openvswitch/vport-gre.c          |  16 +-
 net/openvswitch/vport-internal_dev.c |  38 +-
 net/openvswitch/vport-netdev.c       | 289 ++++++++++++---
 net/openvswitch/vport-netdev.h       |  13 -
 net/openvswitch/vport-vxlan.c        | 322 -----------------
 net/openvswitch/vport-vxlan.h        |  11 -
 net/openvswitch/vport.c              |  34 +-
 net/openvswitch/vport.h              |  21 +-
 63 files changed, 2232 insertions(+), 952 deletions(-)
 create mode 100644 include/linux/lwtunnel.h
 create mode 100644 include/linux/mpls_iptunnel.h
 create mode 100644 include/net/dst_metadata.h
 create mode 100644 include/net/lwtunnel.h
 create mode 100644 include/net/mpls_iptunnel.h
 create mode 100644 include/uapi/linux/lwtunnel.h
 create mode 100644 include/uapi/linux/mpls_iptunnel.h
 create mode 100644 net/core/lwtunnel.c
 create mode 100644 net/mpls/mpls_iptunnel.c
 delete mode 100644 net/openvswitch/vport-vxlan.c
 delete mode 100644 net/openvswitch/vport-vxlan.h

-- 
2.4.3


end of thread, other threads:[~2015-07-22 15:43 UTC | newest]

Thread overview: 29+ messages
2015-07-21  8:43 [PATCH net-next 00/22 v2] Lightweight & flow based encapsulation Thomas Graf
2015-07-21  8:43 ` [PATCH net-next 01/22] rtnetlink: introduce new RTA_ENCAP_TYPE and RTA_ENCAP attributes Thomas Graf
     [not found] ` <cover.1437468140.git.tgraf-G/eBtMaohhA@public.gmane.org>
2015-07-21  8:43   ` [PATCH net-next 02/22] lwtunnel: infrastructure for handling light weight tunnels like mpls Thomas Graf
2015-07-21  8:43   ` [PATCH net-next 03/22] ipv4: support for fib route lwtunnel encap attributes Thomas Graf
2015-07-21  8:43   ` [PATCH net-next 04/22] ipv6: " Thomas Graf
2015-07-21  8:43   ` [PATCH net-next 05/22] lwtunnel: support dst output redirect function Thomas Graf
2015-07-21  8:43   ` [PATCH net-next 06/22] ipv4: redirect dst output to lwtunnel output Thomas Graf
2015-07-21  8:43   ` [PATCH net-next 12/22] dst: Metadata destinations Thomas Graf
2015-07-22  8:58   ` [PATCH net-next 00/22 v2] Lightweight & flow based encapsulation thomas.morin-C0LM0jrOve7QT0dZR+AlfA
     [not found]     ` <17034_1437555506_55AF5B32_17034_5736_1_55AF5B30.8070208-C0LM0jrOve7QT0dZR+AlfA@public.gmane.org>
2015-07-22 15:43       ` roopa
2015-07-21  8:43 ` [PATCH net-next 07/22] ipv6: rt6_info output redirect to tunnel output Thomas Graf
2015-07-21  8:43 ` [PATCH net-next 08/22] mpls: export mpls functions for use by mpls iptunnels Thomas Graf
2015-07-21  8:43 ` [PATCH net-next 09/22] mpls: ip tunnel support Thomas Graf
2015-07-21  8:43 ` [PATCH net-next 10/22] ip_tunnel: Make ovs_tunnel_info and ovs_key_ipv4_tunnel generic Thomas Graf
2015-07-21  8:43 ` [PATCH net-next 11/22] icmp: Don't leak original dst into ip_route_input() Thomas Graf
2015-07-21  8:43 ` [PATCH net-next 13/22] arp: Inherit metadata dst when creating ARP requests Thomas Graf
2015-07-21  8:43 ` [PATCH net-next 14/22] vxlan: Flow based tunneling Thomas Graf
2015-07-21 17:30   ` Alexei Starovoitov
2015-07-21 17:53     ` Thomas Graf
2015-07-21  8:43 ` [PATCH net-next 15/22] route: Extend flow representation with tunnel key Thomas Graf
2015-07-21  8:44 ` [PATCH net-next 16/22] route: Per route IP tunnel metadata via lightweight tunnel Thomas Graf
2015-07-21  8:44 ` [PATCH net-next 17/22] fib: Add fib rule match on tunnel id Thomas Graf
2015-07-21  8:44 ` [PATCH net-next 18/22] vxlan: Factor out device configuration Thomas Graf
2015-07-21  8:44 ` [PATCH net-next 19/22] openvswitch: Make tunnel set action attach a metadata dst Thomas Graf
2015-07-21  8:44 ` [PATCH net-next 20/22] openvswitch: Move dev pointer into vport itself Thomas Graf
2015-07-21  8:44 ` [PATCH net-next 21/22] openvswitch: Abstract vport name through ovs_vport_name() Thomas Graf
2015-07-21  8:44 ` [PATCH net-next 22/22] openvswitch: Use regular VXLAN net_device device Thomas Graf
2015-07-21 17:39 ` [PATCH net-next 00/22 v2] Lightweight & flow based encapsulation David Miller
  -- strict thread matches above, loose matches on Subject: below --
2015-07-17 12:55 [PATCH net-next 00/22] " Thomas Graf
     [not found] ` <cover.1437137396.git.tgraf-G/eBtMaohhA@public.gmane.org>
2015-07-17 12:55   ` [PATCH net-next 05/22] lwtunnel: support dst output redirect function Thomas Graf
