Date: Thu, 7 Mar 2019 10:48:16 +0100
From: Jiri Pirko
To: Jakub Kicinski
Cc: davem@davemloft.net, netdev@vger.kernel.org, oss-drivers@netronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Message-ID: <20190307094816.GA2190@nanopsycho>
References: <20190301180453.17778-1-jakub.kicinski@netronome.com>
 <20190301180453.17778-5-jakub.kicinski@netronome.com>
 <20190302094116.GQ2314@nanopsycho>
 <20190302114847.733759a1@cakuba.netronome.com>
 <20190304075609.GV2314@nanopsycho>
 <20190304163302.7e40219e@cakuba.netronome.com>
 <20190305110601.GC2314@nanopsycho>
 <20190305091534.36200de6@cakuba.hsd1.ca.comcast.net>
 <20190306122037.GB2819@nanopsycho>
 <20190306095638.7c028bdd@cakuba.hsd1.ca.comcast.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
 <20190306095638.7c028bdd@cakuba.hsd1.ca.comcast.net>
User-Agent: Mutt/1.10.1 (2018-07-13)
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Wed, Mar 06, 2019 at 06:56:38PM CET, jakub.kicinski@netronome.com wrote:
>On Wed, 6 Mar 2019 13:20:37 +0100, Jiri Pirko wrote:
>> Tue, Mar 05, 2019 at 06:15:34PM CET, jakub.kicinski@netronome.com wrote:
>> >On Tue, 5 Mar 2019 12:06:01 +0100, Jiri Pirko wrote:
>> >> >> >as ports. Can we invent a new command (say "partition"?) that'd take
>> >> >> >the bus info where the partition is to be spawned?
>> >> >>
>> >> >> Got it. But the question is how different this object would be from the
>> >> >> existing "port" we have today.
>> >> >
>> >> >They'd be where "the other side of a PCI link" is represented,
>> >> >restricting ports to only ASIC's forwarding plane ports.
>> >>
>> >> Basically a "host port", right? It can still be the same port object,
>> >> only with different flavour and attributes. So we would have:
>> >>
>> >> 1) pci/0000:05:00.0/0: type eth netdev enp5s0np0
>> >>                        flavour physical switch_id 00154d130d2f
>> >> 2) pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0
>> >>                            flavour pci_pf pf 0 subport 0
>> >>                            switch_id 00154d130d2f
>> >>                            peer pci/0000:05:00.0/1
>> >> 3) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0vf0
>> >>                            flavour pci_vf pf 0 vf 0
>> >>                            switch_id 00154d130d2f
>> >>                            peer pci/0000:05:10.1/0
>> >> 4) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0s1
>> >>                            flavour pci_pf pf 0 subport 1
>> >>                            switch_id 00154d130d2f
>> >>                            peer pci/0000:05:00.0/2
>> >> 5) pci/0000:05:00.0/1: type eth netdev enp5s0f0??
>> >>                        flavour host <----------------
>> >>                        peer pci/0000:05:00.0/10000
>> >> 6) pci/0000:05:10.1/0: type eth netdev enp5s10f0
>> >>                        flavour host <----------------
>> >>                        peer pci/0000:05:00.0/10001
>> >> 7) pci/0000:05:00.0/2: type eth netdev enp5s0f0??
>> >>                        flavour host <----------------
>> >>                        peer pci/0000:05:00.0/10001
>> >>
>> >> I think it looks quite clear, it gives complete topology view.
>> >
>> >Okay, I have some of questions :)
>> >
>> >What do we use for port_index?
>>
>> That is just a number totally in control of the driver. Driver can
>> assign it in any way.
>>
>> >
>> >What are the operations one can perform on "host ports"?
>>
>> That is a good question. I would start with *none* and extend it upon
>> needs.
>>
>>
>> >
>> >If we have PCI parameters, do they get set on the ASIC side of the port
>> >or the host side of the port?
>>
>> Could you give me an example?
>
>Let's take msix_vec_per_pf_min as an example.
>
>> But I believe that on switch-port side.
>
>Ok.
>
>> >How do those behave when device is passed to VM?
>>
>> In case of VF? VF will have separate devlink instance (separate handle,
>> probably "aliased" to the PF handle). So it would disappear from
>> baremetal and appear in VM:
>> $ devlink dev
>> pci/0000:00:10.0
>> $ devlink dev port
>> pci/0000:00:10.1/0: type eth netdev enp5s10f0
>>                     flavour host
>> That's it for the VM.
>>
>> There's no linkage (peer, alias) between this and the instances on
>> baremetal.
>
>Ok, I guess this is the main advantage from your perspective?
>The fact that "host ports" are visible inside a VM?

Yep. Also on baremetal.

>Or do you believe that having both ends of a pipe as ports makes the
>topology easier to understand?

That as well.

>
>For creating subdevices, I don't think the handle should ever be port.
>We create new ports on a devlink instance, and configure its forwarding

Okay, I agree. Something like:
$ devlink port add pci/0000:00:10.0 .....

It's a bit confusing because "set" accepts a port handle (like
pci/0000:00:10.0/1). Probably better would be:
$ devlink dev port add pci/0000:00:10.0 .....

>with offloads of well established Linux SW constructs. New devices are
>not logically associated with other ports (see how in my patches there
>are 2 "subports" but no main port on that PF - a split not a hierarchy).

Right, basically you have 2 equal objects. Makes sense.

>
>How we want to model forwarding inside a VM (who configures the
>underlying switching) remains unclear.

I don't understand. Could you elaborate a bit?

>
>> >You have a VF devlink instance there - what ports does it show?
>>
>> See above.
>>
>>
>> >
>> >How do those look when the PF is connected to another host? Do they
>> >get spawned at all?
>>
>> What do you mean by "PF is connected to another host"?
>
>Either "SmartNIC":
>
>http://www.mellanox.com/products/smartnic/?ls=gppc&lsd=SmartNIC-gen-smartnic&gclid=EAIaIQobChMIxIrGmYju4AIVy5yzCh2SFwQJEAAYASAAEgIui_D_BwE
>
>or
>
>Multi-host NIC: http://www.mellanox.com/page/multihost

Right. So in this case, I think that the hostport on a specific host
should see the devlink instance and the hostport. However, the
switchports should be only on one selected host (I don't see how to do
that differently).

>
>> >Will this not be confusing to DSA folks who have a CPU port?
>>
>> Why do you think so?
>
>Host and CPU sound quite similar, it is unclear how they differ, and
>why we have a need for both (from user's perspective).

Hmm, the DSA CPU port is something different. It does not have a netdev
associated with it. It is just a port which is physically used in order
to send or receive packets on switch ports. However, in our hostport
case, it has a user-facing netdev associated and the user actually uses
it to send and receive packets directly (assigns an IP, etc.).
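
To make the "same port object, different flavour, plus a peer link" idea
a bit more concrete, here is a rough Python model of entries 1, 2, 3, 5
and 6 from the listing quoted above. This is only a sketch: the Port
class and the helper names (index_ports, asymmetric_peers, host_ports)
are made up for illustration and are not devlink code, devlink output,
or a proposed API. It just checks that every "peer" attribute points
back at the port that names it, and picks out the host-flavoured ports.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for one line of the proposed port listing."""
    handle: str                  # e.g. "pci/0000:05:00.0/10000"
    netdev: str                  # user-facing netdev name
    flavour: str                 # "physical", "pci_pf", "pci_vf" or "host"
    peer: Optional[str] = None   # handle of the other end of the PCI link


# Subset of the quoted listing (entries 1, 2, 3, 5, 6), copied as written.
PORTS = [
    Port("pci/0000:05:00.0/0", "enp5s0np0", "physical"),
    Port("pci/0000:05:00.0/10000", "enp5s0npf0s0", "pci_pf",
         peer="pci/0000:05:00.0/1"),
    Port("pci/0000:05:00.0/10001", "enp5s0npf0vf0", "pci_vf",
         peer="pci/0000:05:10.1/0"),
    # The netdev name for this entry is left open ("??") in the listing.
    Port("pci/0000:05:00.0/1", "enp5s0f0??", "host",
         peer="pci/0000:05:00.0/10000"),
    Port("pci/0000:05:10.1/0", "enp5s10f0", "host",
         peer="pci/0000:05:00.0/10001"),
]


def index_ports(ports: List[Port]) -> Dict[str, Port]:
    """Build a handle -> port table, rejecting duplicate handles."""
    table: Dict[str, Port] = {}
    for port in ports:
        if port.handle in table:
            raise ValueError(f"duplicate port handle: {port.handle}")
        table[port.handle] = port
    return table


def asymmetric_peers(table: Dict[str, Port]) -> List[str]:
    """Return handles whose peer is missing or does not point back."""
    bad = []
    for port in table.values():
        if port.peer is None:
            continue
        other = table.get(port.peer)
        if other is None or other.peer != port.handle:
            bad.append(port.handle)
    return bad


def host_ports(table: Dict[str, Port]) -> List[str]:
    """Return the handles of all ports with the 'host' flavour."""
    return [p.handle for p in table.values() if p.flavour == "host"]


if __name__ == "__main__":
    table = index_ports(PORTS)
    print("asymmetric peers:", asymmetric_peers(table))
    print("host ports:", host_ports(table))
```

Note that every entry in this model carries a netdev name. That is the
part a DSA CPU port would not be able to fill in, which is the
difference described above.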