From: "Michael S. Tsirkin" <mst@redhat.com>
To: Or Gerlitz <ogerlitz@mellanox.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	davem@davemloft.net, roland@kernel.org, netdev@vger.kernel.org,
	ali@mellanox.com, sean.hefty@intel.com,
	Erez Shitrit <erezsh@mellanox.co.il>,
	Doug Ledford <dledford@redhat.com>
Subject: Re: [PATCH V2 09/12] net/eipoib: Add main driver functionality
Date: Mon, 20 Aug 2012 21:57:26 +0300
Message-ID: <20120820185625.GA5234@redhat.com>
In-Reply-To: <20120812205457.GA14081@redhat.com>

On Sun, Aug 12, 2012 at 11:54:57PM +0300, Michael S. Tsirkin wrote:
> > and remember that
> > this code (VM through eipoib) can talk to any IPoIB element on the
> > fabric, native,
> > virtualized,  HW/SW gateways, etc etc.
> > 
> > Or.
> 
> If you want this, then you really want a limited form of IPoIB bridging.


And to clarify that statement, here is how I would make such
IPoIB "bridging" work:

Guest side:

- Implement virtio-ipoib. This would be a device like virtio-net,
  but instead of Ethernet frames it would pass packets
  that consist of:
	IPoIB destination address
	IP packet

- These packets are passed to/from the host without modification,
  possibly with an additional header such as the virtio-net header.

- Flags such as broadcast can also be added to the header.

- Like virtio-net, get capabilities from the host and expose them
  as netdev capabilities.
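
To make the guest-side packet format concrete, here is a minimal sketch in C. It assumes the standard 20-byte IPoIB link-layer address (one reserved/flags byte, a 24-bit QPN, a 128-bit GID). The names vipoib_hdr, VIPOIB_F_BROADCAST, vipoib_make_addr and vipoib_build are hypothetical and only illustrate the idea; they are not an existing virtio interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* IPoIB link-layer address: 1 reserved/flags byte + 24-bit QPN + 128-bit GID. */
#define IPOIB_HW_ADDR_LEN 20

/* Hypothetical flag bit; the proposal only says "flags such as broadcast". */
#define VIPOIB_F_BROADCAST 0x01

/* Hypothetical per-packet header: flags plus the IPoIB destination address.
 * All members are bytes, so the struct has no padding (21 bytes). */
struct vipoib_hdr {
	uint8_t flags;
	uint8_t dst[IPOIB_HW_ADDR_LEN];
};

/* Pack a QPN and GID into the 20-byte IPoIB address layout. */
static void vipoib_make_addr(uint8_t out[IPOIB_HW_ADDR_LEN],
			     uint32_t qpn, const uint8_t gid[16])
{
	out[0] = 0;			/* reserved/flags byte */
	out[1] = (qpn >> 16) & 0xff;	/* QPN, most significant byte first */
	out[2] = (qpn >> 8) & 0xff;
	out[3] = qpn & 0xff;
	memcpy(out + 4, gid, 16);
}

/* Write header followed by the unmodified IP packet into buf.
 * Returns the total length, or 0 if buf is too small. */
static size_t vipoib_build(uint8_t *buf, size_t buflen,
			   const uint8_t dst[IPOIB_HW_ADDR_LEN], uint8_t flags,
			   const void *ip_pkt, size_t ip_len)
{
	struct vipoib_hdr hdr;

	if (buflen < sizeof(hdr) || ip_len > buflen - sizeof(hdr))
		return 0;

	hdr.flags = flags;
	memcpy(hdr.dst, dst, IPOIB_HW_ADDR_LEN);
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), ip_pkt, ip_len);
	return sizeof(hdr) + ip_len;
}
```

The IP packet is copied as-is after the header, matching the "without modification" point above; a real device would presumably also carry a virtio-net style header for offload metadata.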

Host side:
- Create a macvtap-passthrough-like device that can sit on top of an
  ipoib interface.
- Expose this device's QPN and GID to the guest as its hardware address.
- When we get a packet, forward it on UD QPN or CM as appropriate,
  depending on size, checksum and admin preference.
- Expose capabilities such as TSO.
- Can expose capabilities such as max MTU to the guest too.
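
The UD-versus-CM forwarding decision could look roughly like the sketch below. Everything here is an assumption for illustration: the type and function names are hypothetical, the MTU limits and checksum-offload flags are inputs supplied by the caller rather than values taken from the ipoib driver, and the "prefer UD when both qualify" rule is just one possible policy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of the selection; NONE means no path qualifies and
 * the caller must fall back (for example software checksum, or drop). */
enum vipoib_transport { VIPOIB_TX_UD, VIPOIB_TX_CM, VIPOIB_TX_NONE };

/* Hypothetical admin preference knob. */
enum vipoib_admin_pref {
	VIPOIB_PREF_AUTO,
	VIPOIB_PREF_UD_ONLY,
	VIPOIB_PREF_CM_ONLY
};

/* Caller-supplied limits and capabilities for each path (assumptions). */
struct vipoib_tx_policy {
	size_t ud_mtu;		/* largest payload the UD path accepts */
	size_t cm_mtu;		/* largest payload the CM path accepts */
	bool ud_csum_offload;	/* UD path can complete a partial checksum */
	bool cm_csum_offload;	/* CM path can complete a partial checksum */
	enum vipoib_admin_pref pref;
};

/* Pick a transport from packet size, checksum state and admin preference. */
static enum vipoib_transport
vipoib_pick_transport(const struct vipoib_tx_policy *p, size_t len,
		      bool needs_csum)
{
	bool ud_ok = len <= p->ud_mtu && (!needs_csum || p->ud_csum_offload);
	bool cm_ok = len <= p->cm_mtu && (!needs_csum || p->cm_csum_offload);

	switch (p->pref) {
	case VIPOIB_PREF_UD_ONLY:
		return ud_ok ? VIPOIB_TX_UD : VIPOIB_TX_NONE;
	case VIPOIB_PREF_CM_ONLY:
		return cm_ok ? VIPOIB_TX_CM : VIPOIB_TX_NONE;
	default:
		/* Assumed policy: prefer UD, use CM when UD does not qualify. */
		if (ud_ok)
			return VIPOIB_TX_UD;
		return cm_ok ? VIPOIB_TX_CM : VIPOIB_TX_NONE;
	}
}
```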

The above means the hardware address changes with migration,
so we need to notify the guest when this happens.

This can be addressed from the host by notifying all
neighbours.

Alternatively, the guest can notify all neighbours.

Notification can be done by broadcast.
This second option (notification by the guest) seems preferable.

This ipoib-vtap can support two modes:
	- bridge-like mode:
	  guest-to-guest and guest-to-host packets
	  can be detected by macvtap and passed
	  to/from the guest directly, like macvlan bridge mode

	- vepa-like mode:
	  guest-to-guest and guest-to-host packets
	  are sent out and looped back by the IB switch,
	  like macvlan vepa mode
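
The difference between the two modes comes down to one lookup on the transmit path, sketched below. The names vtap_mode, vtap_port and vtap_route are hypothetical, local ports are matched by their full 20-byte IPoIB hardware address, and broadcast handling is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPOIB_HW_ADDR_LEN 20

/* Hypothetical mode and verdict names for illustration only. */
enum vtap_mode { VTAP_MODE_BRIDGE, VTAP_MODE_VEPA };
enum vtap_verdict { VTAP_DELIVER_LOCAL, VTAP_SEND_TO_FABRIC };

/* A local endpoint (guest or host) known to the ipoib-vtap device. */
struct vtap_port {
	uint8_t hw_addr[IPOIB_HW_ADDR_LEN];
};

/* Decide where a transmitted packet goes.
 * Bridge-like mode: a destination matching a local port is delivered
 * directly and *port_idx is set to that port.
 * VEPA-like mode: everything goes out to the fabric and relies on the
 * IB switch to loop local traffic back; *port_idx is left untouched. */
static enum vtap_verdict vtap_route(enum vtap_mode mode,
				    const struct vtap_port *ports,
				    size_t nports,
				    const uint8_t dst[IPOIB_HW_ADDR_LEN],
				    size_t *port_idx)
{
	size_t i;

	if (mode == VTAP_MODE_BRIDGE) {
		for (i = 0; i < nports; i++) {
			if (memcmp(ports[i].hw_addr, dst,
				   IPOIB_HW_ADDR_LEN) == 0) {
				*port_idx = i;
				return VTAP_DELIVER_LOCAL;
			}
		}
	}
	return VTAP_SEND_TO_FABRIC;
}
```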

Compared to the custom protocol I sent, it has:
Advantages: interoperates cleanly with ipoib
Disadvantages: no support for legacy (Ethernet-only) guests

-- 
MST
