Date: Sun, 10 Nov 2019 15:04:17 -0400
From: Jason Gunthorpe
To: Jakub Kicinski
Cc: Parav Pandit, Alex Williamson, Jiri Pirko, David M,
	"gregkh@linuxfoundation.org", "davem@davemloft.net",
	"kvm@vger.kernel.org", "netdev@vger.kernel.org", Saeed Mahameed,
	"kwankhede@nvidia.com", "leon@kernel.org", "cohuck@redhat.com",
	Jiri Pirko, "linux-rdma@vger.kernel.org", Or Gerlitz,
	"Jason Wang (jasowang@redhat.com)"
Subject: Re: [PATCH net-next 00/19] Mellanox, mlx5 sub function support
Message-ID: <20191110190417.GD31761@ziepe.ca>
References: <20191108144054.GC10956@ziepe.ca> <20191108111238.578f44f1@cakuba> <20191108201253.GE10956@ziepe.ca> <20191108133435.6dcc80bd@x1.home> <20191108210545.GG10956@ziepe.ca>
	<20191108145210.7ad6351c@x1.home> <20191109005708.GC31761@ziepe.ca> <20191109094103.739033a3@cakuba>
In-Reply-To: <20191109094103.739033a3@cakuba>
X-Mailing-List: linux-rdma@vger.kernel.org

On Sat, Nov 09, 2019 at 09:41:03AM -0800, Jakub Kicinski wrote:
> On Fri, 8 Nov 2019 20:57:08 -0400, Jason Gunthorpe wrote:
> > On Fri, Nov 08, 2019 at 10:48:31PM +0000, Parav Pandit wrote:
> > > We should be creating 3 different buses, instead of mdev bus being de-multiplexer of that?
> > >
> > > Hence, depending the device flavour specified, create such device on right bus?
> > >
> > > For example,
> > > $ devlink create subdev pci/0000:05:00.0 flavour virtio name foo subdev_id 1
> > > $ devlink create subdev pci/0000:05:00.0 flavour mdev subdev_id 2
> > > $ devlink create subdev pci/0000:05:00.0 flavour mlx5 id 1 subdev_id 3
> >
> > I like the idea of specifying what kind of interface you want at sub
> > device creation time. It fits the driver model pretty well and doesn't
> > require abusing the vfio mdev for binding to a netdev driver.
>
> Aren't the HW resources spun out in all three cases exactly identical?

Exactly? No, not really. The only constant is that some chunk of the
BAR is dedicated to this subdev. The BAR is flexible, so a BAR chunk
configured for virtio is not going to support mlx5 mode.

Aside from that, there are other differences, e.g. mlx5 does not need
a dedicated set of MSI-X vectors while other modes do. There are fewer
MSI-X vectors than SFs, so managing this is important for the admin.

Even in modes which are very similar, like mlx5 vs mdev-vfio, the HW
still has to be configured to provide global DMA isolation on the NIC
for vfio, as the IOMMU cannot be involved. This is extra overhead and
should not be activated unless vfio is being used.

.. and finally the driver core does not support a
'multiple-inheritance' like idea, so we can't have a 'foo_device' that
is three different things.

So somehow the 'flavour' of the 'struct device' has to be exposed to
userspace, and it is best if this is done at device creation time so
the BAR region and HW can be set up once and we don't have to have
complex reconfiguration flows.

Jason
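
To illustrate the last point, here is a minimal userspace sketch in C of
a sub device whose flavour is chosen once, at creation time. It is not
kernel code and not taken from the patch series: enum subdev_flavour,
struct subdev, subdev_create() and subdev_bus_name() are hypothetical
names, and the bus strings are placeholders. The MSI-X and DMA isolation
settings for mlx5 and mdev follow the differences described above; the
DMA isolation setting for virtio is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flavours, mirroring the devlink examples quoted above. */
enum subdev_flavour {
	SUBDEV_FLAVOUR_VIRTIO,
	SUBDEV_FLAVOUR_MDEV,
	SUBDEV_FLAVOUR_MLX5,
};

/* Toy stand-in for a sub device; this is not the kernel's struct device. */
struct subdev {
	unsigned int id;
	enum subdev_flavour flavour;	/* fixed at creation time */
	bool dedicated_msix;		/* mlx5 mode does not need its own vectors */
	bool nic_dma_isolation;		/* extra overhead, only wanted for vfio */
};

/*
 * Configure *sd for exactly one flavour. Returns 0 on success and -1 on
 * bad input, in which case *sd is left untouched.
 */
int subdev_create(struct subdev *sd, unsigned int id,
		  enum subdev_flavour flavour)
{
	struct subdev tmp = { .id = id, .flavour = flavour };

	if (sd == NULL)
		return -1;

	switch (flavour) {
	case SUBDEV_FLAVOUR_VIRTIO:
		tmp.dedicated_msix = true;
		tmp.nic_dma_isolation = false;	/* assumption: vfio not in use */
		break;
	case SUBDEV_FLAVOUR_MDEV:
		tmp.dedicated_msix = true;
		tmp.nic_dma_isolation = true;	/* vfio: NIC must isolate DMA */
		break;
	case SUBDEV_FLAVOUR_MLX5:
		tmp.dedicated_msix = false;
		tmp.nic_dma_isolation = false;
		break;
	default:
		return -1;
	}

	*sd = tmp;
	return 0;
}

/* One device, one flavour, one bus: there is no "all three at once". */
const char *subdev_bus_name(const struct subdev *sd)
{
	assert(sd != NULL);

	switch (sd->flavour) {
	case SUBDEV_FLAVOUR_VIRTIO:
		return "virtio";
	case SUBDEV_FLAVOUR_MDEV:
		return "mdev";
	case SUBDEV_FLAVOUR_MLX5:
		return "mlx5";
	}
	return "unknown";
}
```

Because the flavour is an argument to the create call, the per-flavour
settings are decided in one place and nothing has to be reconfigured
later; changing flavour in this model would mean destroying the sub
device and creating a new one.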