Date: Mon, 4 Mar 2019 12:19:02 +0100
From: Jiri Pirko
To: Jakub Kicinski
Cc: davem@davemloft.net, netdev@vger.kernel.org, oss-drivers@netronome.com
Subject: Re: [PATCH net-next v2 4/7] devlink: allow subports on devlink PCI ports
Message-ID: <20190304111902.GX2314@nanopsycho>
References: <20190301180453.17778-1-jakub.kicinski@netronome.com>
 <20190301180453.17778-5-jakub.kicinski@netronome.com>
 <20190302094116.GQ2314@nanopsycho>
 <20190302114847.733759a1@cakuba.netronome.com>
In-Reply-To: <20190302114847.733759a1@cakuba.netronome.com>

Sat, Mar 02, 2019 at 08:48:47PM CET, jakub.kicinski@netronome.com
wrote:
>On Sat, 2 Mar 2019 10:41:16 +0100, Jiri Pirko wrote:
>> Fri, Mar 01, 2019 at 07:04:50PM CET, jakub.kicinski@netronome.com wrote:
>> >PCI endpoint corresponds to a PCI device, but such device
>> >can have one or more logical device ports associated with it.
>> >We need a way to distinguish those. Add a PCI subport in the
>> >dumps and print the info in phys_port_name appropriately.
>> >
>> >This is not equivalent to port splitting, there is no split
>> >group. It's just a way of representing multiple netdevs on
>> >a single PCI function.
>> >
>> >Note that the quality of being multiport pertains only to
>> >the PCI function itself. A PF having multiple netdevs does
>> >not mean that its VFs will also have multiple, or that VFs
>> >are associated with any particular port of a multiport VF.
>> >
>> >Example (bus 05 device has subports, bus 82 has only one port per
>> >function):
>> >
>> >$ devlink port
>> >pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical
>> >pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 flavour pci_pf pf 0 subport 0
>> >pci/0000:05:00.0/4: type eth netdev enp5s0np1 flavour physical
>> >pci/0000:05:00.0/11000: type eth netdev enp5s0npf0s1 flavour pci_pf pf 0 subport 1
>>
>> So these subport devlink ports are eswitch ports for subports, right?
>>
>> Please see the following drawing:
>>
>>                                  +---+      +---+      +---+
>>                             pfsub| 5 |    vf| 6 |      | 7 |pfsub
>>                                  +-+-+      +-+-+      +-+-+
>> physical link <---------+          |          |          |
>>                         |          |          |          |
>>                         |          |          |          |
>>                         |          |          |          |
>>                       +-+-+      +-+-+      +-+-+      +-+-+
>>                       | 1 |      | 2 |      | 3 |      | 4 |
>>                    +--+---+------+---+------+---+------+---+--+
>>                    | physical    pfsub       vf        pfsub  |
>>                    |   port      port        port      port   |
>>                    |                                          |
>>                    |                 eswitch                  |
>>                    |                                          |
>>                    |                                          |
>>                    +------------------------------------------+
>>
>> 1) pci/0000:05:00.0/0: type eth netdev enp5s0np0 flavour physical switch_id 00154d130d2f
>> 2) pci/0000:05:00.0/10000: type eth netdev enp5s0npf0s0 flavour pci_pf pf 0 subport 0 switch_id 00154d130d2f
>> 3) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0vf0 flavour pci_vf pf 0 vf 0 switch_id 00154d130d2f
>> 4) pci/0000:05:00.0/10001: type eth netdev enp5s0npf0s1 flavour pci_pf pf 0 subport 1 switch_id 00154d130d2f
>>
>> This is basically what you have and I think we are in sync with that.
>> But what about 5,6,7? Should they have devlink port instances too?
>>
>> 5) pci/0000:05:00.0/1: type eth netdev enp5s0f0?? flavour ???? pf 0 subport 0
>> 6) pci/0000:05:10.1/0: type eth netdev enp5s10f0 flavour ???? pf 0 vf 0
>> 7) pci/0000:05:00.0/1: type eth netdev enp5s0f0?? flavour ???? pf 0 subport 1
>>
>> These are the "peers".
>> I think that there could be flavours "pci_pf" and "pci_vf". Then the
>> "representors" (switch ports) could have flavours "pci_pf_port" and
>> "pci_vf_port" or something like that. User can see right away
>> that is not "PF" or "VF" but rather something "on the other end".
>> Note there is no "switch_id" for these devlink ports that tells the user
>> these devlink ports are not part of any switch.
>> What do you think?
>
>Hmmm.. Hm. Hm.
>
>To me it's neat if the devlink instance matches an ASIC. I think it's
>kind of clear for people to understand what it stands for then. So if
>we wanted to do the above we'd have to make the switch_id the first
>class identifier for devlink instances, rather than the bus?
>But then VF instances don't have a switch ID so that doesn't work...
>
>I need to think about it.
>
>It's also kind of strange that we have to add the noun *port* to the
>flavour of... a port... So I would prefer not to have those showing up
>as ports. Can we invent a new command (say "partition"?) that'd take
>the bus info where the partition is to be spawned?

Devlink is not supposed to be there only for switches. From the
beginning, the design was to handle cases where the netdev/ib_dev is not
the correct handle. That applies not only when you have multiple
instances (ports) for one ASIC, but also when you have only one. An
example use case is port-type-change (eth->ib, ib->eth).

I chose the word "port" because the parent devlink instance is "dev",
and if you partition the ASIC you basically get "ports", each of a
different flavour.

And as you said, a devlink instance matches one ASIC. Therefore the
devlink instance should contain all the bits that are part of that ASIC,
not only the switch/eswitch ports. Restricting it to switch ports would
be very limiting.
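
To make the distinction concrete, here is a minimal sketch in Python of
one devlink instance holding both eswitch ports and their peers. It is a
toy model for illustration only, not kernel code: the DevlinkPort class
and the is_eswitch_port() helper are hypothetical names, and the flavour
values "pci_pf_port" and "pci_vf" are just the naming floated earlier in
this thread, not an existing uAPI. The only rule it encodes is the one
discussed above: a port without a switch_id is not part of any switch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DevlinkPort:
    """Toy model of a devlink port (hypothetical, not struct devlink_port)."""
    handle: str                      # bus/device/index, as printed by "devlink port"
    netdev: str
    flavour: str                     # flavour names below are proposals, not uAPI
    switch_id: Optional[str] = None  # None means: not part of any switch


def is_eswitch_port(port: DevlinkPort) -> bool:
    """A port belongs to the eswitch only if it carries a switch_id."""
    return port.switch_id is not None


SWITCH_ID = "00154d130d2f"  # value taken from the listing quoted above

# One devlink instance (one ASIC) holding ports of different flavours.
ports = [
    DevlinkPort("pci/0000:05:00.0/0", "enp5s0np0", "physical", SWITCH_ID),
    DevlinkPort("pci/0000:05:00.0/10000", "enp5s0npf0s0", "pci_pf_port", SWITCH_ID),
    # Peer of the VF representor; the flavour here is an assumption.
    DevlinkPort("pci/0000:05:10.1/0", "enp5s10f0", "pci_vf"),
]

eswitch_ports = [p for p in ports if is_eswitch_port(p)]
peer_ports = [p for p in ports if not is_eswitch_port(p)]

for p in ports:
    kind = "eswitch port" if is_eswitch_port(p) else "peer"
    print(f"{p.handle}: netdev {p.netdev} flavour {p.flavour} ({kind})")
```

With this split, userspace would not need a separate object type for
the peers: it filters the same port list on the presence of switch_id.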