From: Parav Pandit
To: Jiri Pirko
CC: "Samudrala, Sridhar", Jakub Kicinski, "davem@davemloft.net", "netdev@vger.kernel.org", "oss-drivers@netronome.com"
Subject: RE: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Date: Fri, 15 Mar 2019 21:59:33 +0000
References: <20190313095555.0f4f92ca@cakuba.attlocal.net>
 <20190314073840.GA3034@nanopsycho>
 <20190314150945.031d1b08@cakuba.netronome.com>
 <20190314163915.24fd2481@cakuba.netronome.com>
 <4436da3d-4b99-f792-8e77-695d5958794d@intel.com>
 <20190315200814.GD2305@nanopsycho>
In-Reply-To: <20190315200814.GD2305@nanopsycho>
X-Mailing-List: netdev@vger.kernel.org

> -----Original Message-----
> From: Jiri Pirko
> Sent: Friday, March 15, 2019 3:08 PM
> To: Parav Pandit
> Cc: Samudrala, Sridhar; Jakub Kicinski; davem@davemloft.net;
> netdev@vger.kernel.org; oss-drivers@netronome.com
> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI
> ports
>
> Fri, Mar 15, 2019 at 04:32:24PM CET, parav@mellanox.com wrote:
> >
> >
> >> -----Original Message-----
> >> From: Samudrala, Sridhar
> >> Sent: Friday, March 15, 2019 12:58 AM
> >> To: Parav Pandit; Jakub Kicinski
> >> Cc: Jiri Pirko; davem@davemloft.net;
> >> netdev@vger.kernel.org; oss-drivers@netronome.com
> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> devlink PCI ports
> >>
> >>
> >> On 3/14/2019 7:40 PM, Parav Pandit wrote:
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: Samudrala, Sridhar
> >> >> Sent: Thursday, March 14, 2019 9:16 PM
> >> >> To: Parav Pandit; Jakub Kicinski
> >> >> Cc: Jiri Pirko; davem@davemloft.net;
> >> >> netdev@vger.kernel.org; oss-drivers@netronome.com
> >> >> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> >> devlink PCI ports
> >> >>
> >> >>
> >> >>
> >> >> On 3/14/2019 6:28 PM, Parav Pandit wrote:
> >> >>>
> >> >>>
> >> >>>> -----Original Message-----
> >> >>>> From: Jakub Kicinski
> >> >>>> Sent: Thursday, March 14, 2019 6:39 PM
> >> >>>> To: Parav Pandit
> >> >>>> Cc: Jiri Pirko; davem@davemloft.net;
> >> >>>> netdev@vger.kernel.org; oss-drivers@netronome.com
> >> >>>> Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on
> >> >>>> devlink PCI ports
> >> >>>>
> >> >>>> On Thu, 14 Mar 2019 22:35:36 +0000, Parav Pandit wrote:
> >> >>>>>>> Then instances of flavour pci_vf are going to appear in the
> >> >>>>>>> same devlink instance. Those are the switch ports:
> >> >>>>>>> pci/0000:05:00.0/10002: type eth netdev enp5s0npf0pf0s0
> >> >>>>>>> flavour pci_vf pf 0 vf 0
> >> >>>>>>> switch_id 00154d130d2f peer
> >> >>>>>>> pci/0000:05:10.1/0
> >> >>>>>>> pci/0000:05:00.0/10003: type eth netdev enp5s0npf0pf0s0
> >> >>>>>>> flavour pci_vf pf 0 vf 0 subport 1
> >> >>>>>>> switch_id 00154d130d2f peer
> >> >>>>>>> pci/0000:05:10.1/1
> >> >>>>>>>
> >> >>>>>>> With that, peers are going to appear too, and those are the
> >> >>>>>>> actual VF/VF subport:
> >> >>>>>>> pci/0000:05:10.1/0: type eth netdev ??? flavour pci_vf_host
> >> >>>>>>> peer pci/0000:05:00.0/10002
> >> >>>>>>> pci/0000:05:10.1/1: type eth netdev ??? flavour pci_vf_host
> >> >>>>>>> peer pci/0000:05:00.0/10003
> >> >>>>>>>
> >> >>>>>>> Later you can push this VF along with all subports to VM. So
> >> >>>>>>> in VM, you are going to see the VF like this:
> >> >>>>>>> $ devlink dev
> >> >>>>>>> pci/0000:00:08.0
> >> >>>>>>> $ devlink port
> >> >>>>>>> pci/0000:00:08.0/0: type eth netdev ??? flavour pci_vf_host
> >> >>>>>>> pci/0000:00:08.0/1: type eth netdev ??? flavour pci_vf_host
> >> >>>>>>>
> >> >>>>>>> And back to your question of how are they connected in eswitch.
> >> >>>>>>> That is totally up to the original user John who did the creation.
> >> >>>>>>> He is in charge of the eswitch on baremetal, he would
> >> >>>>>>> configure the forwarding however he likes.
> >> >>>>>>
> >> >>>>>> Ack, so I think you're saying VM has to communicate to the
> >> >>>>>> cloud environment to have this provisioned using some service
> >> >>>>>> API, not a kernel API. That's what I wanted to confirm.
> >> >>>>>>
> >> >>>>>> I don't see any benefit to having the "host ports" under
> >> >>>>>> devlink, as such I think it's a matter of preference.
> >> >>>>>
> >> >>>>> We need 'host ports' to configure parameters of this host port
> >> >>>>> which is not exposed by the rep-netdev.
> >> >>>>> Such as mac address.
> >> >>>>
> >> >>>> Please look at the quote of what Jiri wrote above - the host
> >> >>>> port gets passed to the VM, you can't use it as a handle to set the MAC.
> >> >>>>
> >> >>>> The way to set the MAC remains:
> >> >>>>
> >> >>>> # devlink port set pci/0000:05:00.0/10002 peer mac_addr 00:11:22:33:44:55
> >> >>>>
> >> >>> Even though it can be done, I think this is wrong model to program
> >> >>> hostport mac address using eswitch port.
> >> >>> All devlink objects are control objects, so what is passed to VM
> >> >>> is what is represented by devlink.
> >> >>> VF in the VM will anyway create its devlink object.
> >> >>> What is wrong in programming hostport?
> >> >>> It gives a very clear view to users of topology and objects.
> >> >>
> >> >> The VF or any subport MAC address should be configured by the
> >> >> orchestration layer that is running on the hypervisor and when a
> >> >> VF is assigned to a VF, the host port is not visible to the hypervisor.
> >> > What prevents creation of hostport due to which is not visible?
> >> > Hostport is control port to program host side of parameters.
> >> > It should be created when user wants to program the parameters.
> >> >
> >> > Model is really straight forward.
> >> > Program host port params using hostport object.
> >> > Program switchport params using rep-netdev.
> >>
> >> IIUC, Jiri/Jakub are proposing creation of 2 devlink objects for each
> >> port - host facing ports and switch facing ports. This is in addition
> >> to the netdevs that are created today.
> >>
> >I am not proposing any different.
> >I am proposing only two changes.
> >1. control hostport params via referring hostport (not via indirect
> >peer)
>
> Not really possible. If you passthrough VF into VM, the hostport goes along
> with it.
>
No.
Sorry, the enumeration I showed earlier was the source of the confusion.
Below is the right enumeration.
When the VF is initially enumerated in the host, where the eswitch devlink instance is located, the enumeration below is seen.
The first two entries show the link between hostport and switchport.

$ devlink port show
pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:00.0/1
pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002
pci/0000:05:10.1/0 eth netdev flavour hostport

This last entry won't be seen if VF auto probing is disabled, because then the VF is not enumerated.

As a user, I will be programming the MAC address of the hostport for a VF:
pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002

>
> >2. flavour should not be vf/pf, flavour should be hostport, switchport.
> >Because switch is flat and agnostic of pf/vf/mdev.
>
> Not sure. It's good to have this kind of visibility.
>
A port can have a label/attribute indicating that it belongs to VF-1 or an mdev, as long as you agree to have an mdev attribute on the host port
(and do not ask for abstracting it, because mdev is a well-defined kernel object).

>
> >
> >> Are you suggesting that all the devlink objects should be visible
> >> only at the hypervisor layer?
> >>
> >Of course not.
> >
> >Ports and params controlled by hypervisor should be exposed at
> >hypervisor/eswitch wherever its parent devlink instance exist.
> >Ports which should be visible inside a VM should be exposed inside a VM.
> >So for a given VF,
> >
> >If eswitch is at hypervisor level,
> >$ devlink port show
> >pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id
> >00154d130d2f peer pci/0000:05:10.1/0
> >pci/0000:05:10.1/0 eth netdev flavour hostport switch_id 00154d130d2f
> >peer pci/0000:05:00.0/10002
> >
> >where VF is enumerated,
> >$ devlink port show
> >pci/0000:05:10.1/0 eth netdev flavour hostport
>
> So this is how it looks like in VM, right?
>
Yep.
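As a rough illustration of the model described above (a sketch, not actual devlink code or output), the proposed listing can be parsed into port records and the hostport/switchport peer links checked for symmetry. The `flavour hostport` / `flavour switchport` fields and the one-port-per-line layout are taken from the example in this mail and are only a proposal; `Port`, `parse_ports` and `check_peers` are hypothetical names introduced for this example.

```python
# Sketch only: parses the *proposed* `devlink port show` layout from this
# thread (flavour hostport/switchport); real devlink output may differ.
from typing import Dict, List, NamedTuple, Optional


class Port(NamedTuple):
    handle: str               # e.g. "pci/0000:05:00.0/10002"
    flavour: str              # "hostport" or "switchport"
    switch_id: Optional[str]  # absent when the port has no eswitch visibility
    peer: Optional[str]       # handle of the linked port, if any


def parse_ports(text: str) -> Dict[str, Port]:
    """Parse one port per line; keyword/value pairs follow the handle."""
    ports = {}
    for line in text.strip().splitlines():
        tokens = line.split()
        if not tokens:
            continue
        handle, rest = tokens[0], tokens[1:]
        fields = {}
        for key in ("flavour", "switch_id", "peer"):
            if key in rest:
                idx = rest.index(key)
                if idx + 1 < len(rest):
                    fields[key] = rest[idx + 1]
        ports[handle] = Port(handle, fields.get("flavour", ""),
                             fields.get("switch_id"), fields.get("peer"))
    return ports


def check_peers(ports: Dict[str, Port]) -> List[str]:
    """Return handles whose peer link is missing or not mutual."""
    broken = []
    for port in ports.values():
        if port.peer is None:
            continue
        other = ports.get(port.peer)
        if other is None or other.peer != port.handle:
            broken.append(port.handle)
    return broken


# Host-side view copied from the example in this mail.
HOST_VIEW = """
pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:00.0/1
pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002
pci/0000:05:10.1/0 eth netdev flavour hostport
"""

ports = parse_ports(HOST_VIEW)
print(check_peers(ports))  # empty list when every peer link is mutual
```

In this sketch the hostport that stays in the eswitch devlink instance (`pci/0000:05:00.0/1`) carries the peer link, while the port enumerated on the VF itself (`pci/0000:05:10.1/0`) has neither `switch_id` nor `peer`, which matches the point that an unprivileged VF has no visibility into the eswitch.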
Once the VF is mapped to a VM, only two entries are seen, and the hostport can still be controlled.

$ devlink port show
pci/0000:05:00.0/10002 eth netdev flavour switchport switch_id 00154d130d2f peer pci/0000:05:00.0/1
pci/0000:05:00.0/1 eth netdev flavour hostport switch_id 00154d130d2f peer pci/0000:05:00.0/10002

This addresses the InfiniBand case, where there is no eswitch but hostports exist and should be managed.
We shouldn't be inventing new devlink APIs or creating a fake sw eswitch object that doesn't exist in hw.

>
> >This is because unprivileged VF doesn't have visibility to eswitch and its links.
> >
> >> I think the terminology need to be defined clearly so that we are all
> >> on the same page.
> >>
> >> >
> >> >> Currently we have ndo_set_vf_mac_addr api that works with PF
> >> >> netdev, but i think we are trying to move away from that API and
> >> >> do all the configuration via the port representor netdevs.
> >> > This is fine rep-netdev represents eswitch port.
> >> > You normally don't go to switch to program host port params.
> >> >
> >> >> As the mac address cannot be configured using this netdev, i think
> >> >> Jakub is suggesting creating a devlink object for each port
> >> >> representor and use that interface to set peer mac address.
> >> >
> >> > I understand but is convoluted interface.
> >> > When you program host NIC mac address you talk to iLo or BIOS.
> >> > When you program switch side mac address, you go switch/router/modem.
> >> >
> >> > Also programming host params on host side, also doesn't make
> >> > assumption that its connected to eswitch.
> >> > It also doesn't assume that same connectivity for its life.
> >> >
> >> > If you model around how physical devices are configured, it will almost
> >> > never go wrong and still provides same level of flexibility.
> >> >
> >> >> We should be able use this to configure port vlan too.
> >> >>
> >> >> Also, instead of subport, can we call vport and support different
> >> >> types of vports - sr-iov, siov, vmdq etc.
> >> >>
> >> > At switch level there are just ports.
> >> > sriov, siov, mdev, vmdq are their counterpart (peer) where it is connected.
> >> >
> >> >>>
> >> >>> Also eswitch is flat. There is no need of pf/vf flavour for port.
> >> >>> It doesn't make sense to define 'mdev' flavour which we are already
> >> >>> working.
> >> >>> At eswitch level it is just a port, it happen to be connected to vf or pf or
> >> >>> other objects, it doesn't matter.
> >> >>> Port should be flavoured as 'hostport' or 'switchport'.
> >> >>>
> >> >>>
> >> >>>> (using the port ids from above)