netdev.vger.kernel.org archive mirror
From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: davem@davemloft.net, oss-drivers@netronome.com,
	netdev@vger.kernel.org, parav@mellanox.com, jgg@mellanox.com
Subject: Re: [PATCH net-next 4/8] devlink: allow subports on devlink PCI ports
Date: Fri, 1 Mar 2019 08:25:57 +0100
Message-ID: <20190301072557.GG2314@nanopsycho>
In-Reply-To: <20190228082404.5b6d1061@cakuba.netronome.com>

Thu, Feb 28, 2019 at 05:24:04PM CET, jakub.kicinski@netronome.com wrote:
>On Thu, 28 Feb 2019 09:56:24 +0100, Jiri Pirko wrote:
>> Wed, Feb 27, 2019 at 07:30:00PM CET, jakub.kicinski@netronome.com wrote:
>> >On Wed, 27 Feb 2019 13:37:53 +0100, Jiri Pirko wrote:  
>> >> Tue, Feb 26, 2019 at 07:24:32PM CET, jakub.kicinski@netronome.com wrote:  
>> >> >PCI endpoint corresponds to a PCI device, but such device
>> >> >can have one or more logical device ports associated with it.
>> >> >We need a way to distinguish those. Add a PCI subport in the
>> >> >dumps and print the info in phys_port_name appropriately.
>> >> >
>> >> >This is not equivalent to port splitting, there is no split
>> >> >group. It's just a way of representing multiple netdevs on
>> >> >a single PCI function.
>> >> >
>> >> >Note that the quality of being multiport pertains only to
>> >> >the PCI function itself. A PF having multiple netdevs does
>> >> >not mean that its VFs will also have multiple, or that VFs
>> >> >are associated with any particular port of a multiport VF.  
>> >> 
>> >> We've been discussing the problem of subport (we call it "subfunction"
>> >> or "SF") for some time internally. It turned out this is probably a harder
>> >> task to model. Please prove me wrong.
>> >> 
>> >> The nature of VF makes it a logically separate entity. It has a separate
>> >> PCI address, it should therefore have a separate devlink instance.
>> >> You can pass it through to VM, then the same devlink instance should be
>> >> created inside the VM and disappear from the host.  
>> >
>> >Depends what a devlink instance represents :/  On one hand you may want
>> >to create an instance for a VF to allow it to spawn soft ports, on the
>> >other you may want to group multiple functions together.
>> >
>> >IOW if devlink instance is for an ASIC, there should be one per device
>> >per host.  So if we start connecting multiple functions (PFs and/or VFs)
>> >to one host we should probably introduce the notion of devlink aliases
>> >or some such (so that multiple bus addresses can target the same  
>> 
>> Hmm. Like VF address -> PF address alias? That would be confusing to see
>> eswitch ports under VF devlink instance... I probably did not get you
>> right.
>
>No eswitch ports under VF, more in case of multi-PF.  Bus addresses of
>all PFs aliasing to the same devlink instance.

The multi-PF aliasing makes sense to me.


>
>> >devlink instance).  Those less pipelined NICs can forward between
>> >ports, but still want a function per port (otherwise user space
>> >sometimes gets confused).  If we have multiple functions which are on
>> >the same "switchid" they should have a single devlink instance if you
>> >ask me.  That instance will have all the ports of the device.  
>> 
>> Okay, that makes sense. But the question is, can the same devlink
>> instance contain ports that do not have a "switchid"?
>
>No strong preference if switchid is different.  To me devlink is an ASIC
>instance, if the multiport card is constructed by copy-pasting the same
>IP twice onto a die, and the ports really are completely separate, there
>is no reason to require single devlink instance.

Okay.


>
>> I think it would be beneficial to have the switchid shown for devlink
>> ports too. Then it is clear that the devlink ports with the same
>> switchid belong to the same switch, and other ports under the same
>> devlink instance (like PF itself) are separate, but still under the same
>> ASIC.
>
>Sure, you mean in terms of UI - user space can do a link dump or get
>that from sysfs, right?

I'm thinking about moving it to devlink. I'll work on it more today.


>
>> >You say disappear from the host - what do you mean?  Are you referring
>> >to the VF port disappearing?  But on the switch the port is still  
>> 
>> No, VF itself. eswitch port will be still there on the host.
>> 
>> 
>> >there, and you should show the subports on the PF side IMHO.  Devlink
>> >ports should allow users to understand the topology of the switch.  
>> 
>> What do you mean by "topology"?
>
>Mostly which ports are part of the switch and what's their "flavour".
>Also (less importantly) which host netdevs are "peers" of eswitch ports.

Makes sense.


>
>> >Is spawning VMDq sub-instances the only thing we can think of that VMs
>> >may want to do?  Are there any other uses?
>> >  
>> >> SF (or subport) feels similar to that. Basically it is exactly the same
>> >> thing as VF, only it resides under the PF PCI function.
>> >> 
>> >> That is why I think, for sake of consistency, it should have a separate
>> >> devlink entity as well. The problem is correct sysfs modelling and
>> >> devlink handle derived from that. Parav is working on a simple soft
>> >> bus for this purpose called "subbus". There is an RFC floating around on
>> >> the Mellanox internal mailing list; looks like it is time to send it
>> >> upstream.
>> >> 
>> >> Then each PF driver which has SFs would register subbus devices
>> >> according to SFs/subports and they would be properly handled by bus
>> >> probe, devlink and devlink port and netdev instances created.
>> >> 
>> >> Ccing Parav and Jason.  
>> >
>> >You guys come from the RDMA side of the world, with which I'm less
>> >familiar, and the soft bus + spawning devices seems to be a popular
>> >design there.  Could you describe the advantages of that model for 
>> >the sake of the netdev-only folks? :)  
>> 
>> I'll try to draw some ascii art :)
>
>Yess :)
>
>> >Another term that gets thrown into the mix here is mediated devices,
>> >right?  If you wanna pass the sub-spawn-soft-port to a VM.  Or run 
>> >DPDK on some queues.
>> >
>> >To state the obvious AF_XDP and macvlan offload were our previous
>> >answers to some of those use cases.  What is the forwarding model
>> >for those subports?  Are we going to allow flower rules from VMs?
>> >Is it going to be dst MAC only?  Or is the hypervisor going to forward
>> >as it sees appropriate (OvS + "repr"/port netdev)?  
