Netdev Archive on lore.kernel.org
 help / color / Atom feed
From: Jason Gunthorpe <jgg@ziepe.ca>
To: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Parav Pandit <parav@mellanox.com>, Jiri Pirko <jiri@resnulli.us>,
	David M <david.m.ertman@intel.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Saeed Mahameed <saeedm@mellanox.com>,
	"kwankhede@nvidia.com" <kwankhede@nvidia.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"cohuck@redhat.com" <cohuck@redhat.com>,
	Jiri Pirko <jiri@mellanox.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Or Gerlitz <gerlitz.or@gmail.com>
Subject: Re: [PATCH net-next 00/19] Mellanox, mlx5 sub function support
Date: Sun, 10 Nov 2019 15:37:59 -0400
Message-ID: <20191110193759.GE31761@ziepe.ca> (raw)
In-Reply-To: <20191109092747.26a1a37e@cakuba>

On Sat, Nov 09, 2019 at 09:27:47AM -0800, Jakub Kicinski wrote:
> On Fri, 8 Nov 2019 20:44:26 -0400, Jason Gunthorpe wrote:
> > On Fri, Nov 08, 2019 at 01:45:59PM -0800, Jakub Kicinski wrote:
> > > Yes, my suggestion to use mdev was entirely based on the premise that
> > > the purpose of this work is to get vfio working.. otherwise I'm unclear
> > > as to why we'd need a bus in the first place. If this is just for
> > > containers - we have macvlan offload for years now, with no need for a
> > > separate device.  
> > 
> > This SF thing is a full fledged VF function, it is not at all like
> > macvlan. This is perhaps less important for the netdev part of the
> > world, but the difference is very big for the RDMA side, and should
> > enable VFIO too..
> 
> Well, macvlan used VMDq so it was pretty much a "legacy SR-IOV" VF.
> I'd perhaps need to learn more about RDMA to appreciate the difference.

It has a lot to do with the how the RDMA functionality works in the
HW.. At least for mlx the RDMA is 'below' all the netdev stuff, so
even though netdev has some offloaded vlan RDMA sees, essentially, the
union of all the vlan's on the system.

Which at least breaks the security model of a macvlan device for
net-namespaces.

Maybe with new HW something could be done, but today, the HW is
limited.

> > > On the RDMA/Intel front, would you mind explaining what the main
> > > motivation for the special buses is? I'm a little confurious.  
> > 
> > Well, the issue is driver binding. For years we have had these
> > multi-function netdev drivers that have a single PCI device which must
> > bind into multiple subsystems, ie mlx5 does netdev and RDMA, the cxgb
> > drivers do netdev, RDMA, SCSI initiator, SCSI target, etc. [And I
> > expect when NVMe over TCP rolls out we will have drivers like cxgb4
> > binding to 6 subsytems in total!]
> 
> What I'm missing is why is it so bad to have a driver register to
> multiple subsystems.

Well, for example, if you proposed to have a RDMA driver in
drivers/net/ethernet/foo/, I would NAK it, and I hope Dave would
too. Same for SCSI and nvme.

This Linux process is that driver code for a subsystem lives in the
subsystem and should be in a subsystem specific module. While it is
technically possible to have a giant driver, it distorts our process
in a way I don't think is good.

So, we have software layers between the large Linux subsystems just to
make the development side manageable and practical.

.. once the code lives in another subsystem, it is in a new module. A
new module requires some way to connect them all together, the driver
core is the logical way to do this connection.

I don't think a driver should be split beyond that. Even my suggestion
of a 'core' may in practice just be the netdev driver as most of the
other modules can't function without netdev. ie you can't do iSCSI
without an IP stack.

> > What is a generation? Mellanox has had a stable RDMA driver across
> > many sillicon generations. Intel looks like their new driver will
> > support at least the last two or more sillicon generations..
> > 
> > RDMA drivers are monstrous complex things, there is a big incentive to
> > not respin them every time a new chip comes out.
> 
> Ack, but then again none of the drivers gets rewritten from scratch,
> right? It's not that some "sub-drivers" get reused and some not, no?

Remarkably Intel is saying their new RDMA 'sub-driver' will be compatible
with their ICE and pre-ICE (sorry, forget the names) netdev core
drivers. 

netdev will get a different driver for each, but RDMA will use the
same driver.

Jason

  parent reply index

Thread overview: 132+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-07 16:04 Parav Pandit
2019-11-07 16:08 ` [PATCH net-next 01/19] net/mlx5: E-switch, Move devlink port close to eswitch port Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 02/19] net/mlx5: E-Switch, Add SF vport, vport-rep support Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 03/19] net/mlx5: Introduce SF table framework Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 04/19] net/mlx5: Introduce SF life cycle APIs to allocate/free Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 05/19] net/mlx5: E-Switch, Enable/disable SF's vport during SF life cycle Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 06/19] net/mlx5: Add support for mediated devices in switchdev mode Parav Pandit
2019-11-08 10:32     ` Jiri Pirko
2019-11-08 16:03       ` Parav Pandit
2019-11-08 16:22         ` Jiri Pirko
2019-11-08 16:29           ` Parav Pandit
2019-11-08 18:01             ` Jiri Pirko
2019-11-08 18:04             ` Jiri Pirko
2019-11-08 18:21               ` Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 07/19] vfio/mdev: Introduce sha1 based mdev alias Parav Pandit
2019-11-08 11:04     ` Jiri Pirko
2019-11-08 15:59       ` Parav Pandit
2019-11-08 16:28         ` Jiri Pirko
2019-11-08 11:10     ` Cornelia Huck
2019-11-08 16:03       ` Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 08/19] vfio/mdev: Make mdev alias unique among all mdevs Parav Pandit
2019-11-08 10:49     ` Jiri Pirko
2019-11-08 15:13       ` Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 09/19] vfio/mdev: Expose mdev alias in sysfs tree Parav Pandit
2019-11-08 13:22     ` Jiri Pirko
2019-11-08 18:03       ` Alex Williamson
2019-11-08 18:16         ` Jiri Pirko
2019-11-07 16:08   ` [PATCH net-next 10/19] vfio/mdev: Introduce an API mdev_alias Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 11/19] vfio/mdev: Improvise mdev life cycle and parent removal scheme Parav Pandit
2019-11-08 13:01     ` Cornelia Huck
2019-11-08 16:12       ` Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 12/19] devlink: Introduce mdev port flavour Parav Pandit
2019-11-07 20:38     ` Jakub Kicinski
2019-11-07 21:03       ` Parav Pandit
2019-11-08  1:17         ` Jakub Kicinski
2019-11-08  1:44           ` Parav Pandit
2019-11-08  2:20             ` Jakub Kicinski
2019-11-08  2:31               ` Parav Pandit
2019-11-08  9:46                 ` Jiri Pirko
2019-11-08 15:45                   ` Parav Pandit
2019-11-08 16:31                     ` Jiri Pirko
2019-11-08 16:43                       ` Parav Pandit
2019-11-08 18:11                         ` Jiri Pirko
2019-11-08 18:23                           ` Parav Pandit
2019-11-08 18:34                             ` Jiri Pirko
2019-11-08 18:56                               ` Parav Pandit
2019-11-08  9:30               ` Jiri Pirko
2019-11-08 15:41                 ` Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 13/19] net/mlx5: Register SF devlink port Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 14/19] net/mlx5: Share irqs between SFs and parent PCI device Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 15/19] net/mlx5: Add load/unload routines for SF driver binding Parav Pandit
2019-11-08  9:48     ` Jiri Pirko
2019-11-08 11:13       ` Jiri Pirko
2019-11-07 16:08   ` [PATCH net-next 16/19] net/mlx5: Implement dma ops and params for mediated device Parav Pandit
2019-11-07 20:42     ` Jakub Kicinski
2019-11-07 21:30       ` Parav Pandit
2019-11-08  1:16         ` Jakub Kicinski
2019-11-08  6:37     ` Christoph Hellwig
2019-11-08 15:29       ` Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 17/19] net/mlx5: Add mdev driver to bind to mdev devices Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 18/19] Documentation: net: mlx5: Add mdev usage documentation Parav Pandit
2019-11-07 16:08   ` [PATCH net-next 19/19] mtty: Optionally support mtty alias Parav Pandit
2019-11-08  6:26     ` Leon Romanovsky
2019-11-08 10:45     ` Jiri Pirko
2019-11-08 15:08       ` Parav Pandit
2019-11-08 15:15         ` Jiri Pirko
2019-11-08 13:46     ` Cornelia Huck
2019-11-08 15:10       ` Parav Pandit
2019-11-08 15:28         ` Cornelia Huck
2019-11-08 15:30           ` Parav Pandit
2019-11-08 17:54             ` Alex Williamson
2019-11-08  9:51   ` [PATCH net-next 01/19] net/mlx5: E-switch, Move devlink port close to eswitch port Jiri Pirko
2019-11-08 15:50     ` Parav Pandit
2019-11-07 17:03 ` [PATCH net-next 00/19] Mellanox, mlx5 sub function support Leon Romanovsky
2019-11-07 20:10   ` Parav Pandit
2019-11-08  6:20     ` Leon Romanovsky
2019-11-08 15:01       ` Parav Pandit
2019-11-07 20:32 ` Jakub Kicinski
2019-11-07 20:52   ` Parav Pandit
2019-11-08  1:16     ` Jakub Kicinski
2019-11-08  1:49       ` Parav Pandit
2019-11-08  2:12         ` Jakub Kicinski
2019-11-08 12:12   ` Jiri Pirko
2019-11-08 14:40     ` Jason Gunthorpe
2019-11-08 15:40       ` Parav Pandit
2019-11-08 19:12         ` Jakub Kicinski
2019-11-08 20:12           ` Jason Gunthorpe
2019-11-08 20:20             ` Parav Pandit
2019-11-08 20:32               ` Jason Gunthorpe
2019-11-08 20:52                 ` gregkh
2019-11-08 20:34             ` Alex Williamson
2019-11-08 21:05               ` Jason Gunthorpe
2019-11-08 21:19                 ` gregkh
2019-11-08 21:52                 ` Alex Williamson
2019-11-08 22:48                   ` Parav Pandit
2019-11-09  0:57                     ` Jason Gunthorpe
2019-11-09 17:41                       ` Jakub Kicinski
2019-11-10 19:04                         ` Jason Gunthorpe
2019-11-10 19:48                       ` Parav Pandit
2019-11-11 14:17                         ` Jiri Pirko
2019-11-11 14:58                           ` Parav Pandit
2019-11-11 15:06                             ` Jiri Pirko
2019-11-19  4:51                               ` Parav Pandit
2019-11-09  0:12                   ` Jason Gunthorpe
2019-11-09  0:45                     ` Parav Pandit
2019-11-11  2:19                 ` Jason Wang
2019-11-08 21:45             ` Jakub Kicinski
2019-11-09  0:44               ` Jason Gunthorpe
2019-11-09  8:46                 ` gregkh
2019-11-09 11:18                   ` Jiri Pirko
2019-11-09 17:28                     ` Jakub Kicinski
2019-11-10  9:16                     ` gregkh
2019-11-09 17:27                 ` Jakub Kicinski
2019-11-10  9:18                   ` gregkh
2019-11-11  3:46                     ` Jakub Kicinski
2019-11-11  5:18                       ` Parav Pandit
2019-11-11 13:30                       ` Jiri Pirko
2019-11-11 14:14                         ` gregkh
2019-11-11 14:37                           ` Jiri Pirko
2019-11-10 19:37                   ` Jason Gunthorpe [this message]
2019-11-11  3:57                     ` Jakub Kicinski
2019-11-08 16:06     ` Parav Pandit
2019-11-08 19:06     ` Jakub Kicinski
2019-11-08 19:34       ` Parav Pandit
2019-11-08 19:48         ` Jakub Kicinski
2019-11-08 19:41       ` Jiri Pirko
2019-11-08 20:40         ` Parav Pandit
2019-11-08 21:21         ` Jakub Kicinski
2019-11-08 21:39           ` Jiri Pirko
2019-11-08 21:51             ` Jakub Kicinski
2019-11-08 22:21               ` Jiri Pirko
2019-11-07 23:57 ` David Miller

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191110193759.GE31761@ziepe.ca \
    --to=jgg@ziepe.ca \
    --cc=alex.williamson@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=davem@davemloft.net \
    --cc=david.m.ertman@intel.com \
    --cc=gerlitz.or@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jakub.kicinski@netronome.com \
    --cc=jiri@mellanox.com \
    --cc=jiri@resnulli.us \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=leon@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=parav@mellanox.com \
    --cc=saeedm@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org
	public-inbox-index netdev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git