From: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>,
	netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org,
	pabeni@redhat.com, edumazet@google.com,
	intel-wired-lan@lists.osuosl.org, jiri@nvidia.com,
	anthony.l.nguyen@intel.com, alexandr.lobakin@intel.com,
	wojciech.drewek@intel.com, lukasz.czapnik@intel.com,
	shiraz.saleem@intel.com, jesse.brandeburg@intel.com,
	mustafa.ismail@intel.com, przemyslaw.kitszel@intel.com,
	piotr.raczynski@intel.com, jacob.e.keller@intel.com,
	david.m.ertman@intel.com, leszek.kaliszczuk@intel.com
Subject: Re: [PATCH net-next 00/13] resource management using devlink reload
Date: Thu, 17 Nov 2022 14:39:50 +0100	[thread overview]
Message-ID: <Y3Y5phsWzatdnwok@localhost.localdomain> (raw)
In-Reply-To: <Y3Ye4kwmtPrl33VW@unreal>

On Thu, Nov 17, 2022 at 01:45:38PM +0200, Leon Romanovsky wrote:
> On Thu, Nov 17, 2022 at 12:10:21PM +0100, Michal Swiatkowski wrote:
> > On Wed, Nov 16, 2022 at 07:59:43PM +0200, Leon Romanovsky wrote:
> > > On Wed, Nov 16, 2022 at 01:04:36PM +0100, Michal Swiatkowski wrote:
> > > > On Wed, Nov 16, 2022 at 08:04:56AM +0200, Leon Romanovsky wrote:
> > > > > On Tue, Nov 15, 2022 at 07:59:06PM -0600, Samudrala, Sridhar wrote:
> > > > > > On 11/15/2022 11:57 AM, Leon Romanovsky wrote:
> > > > > 
> > > > > <...>
> > > > > 
> > > > > > > > In case of ice driver lspci -vs shows:
> > > > > > > > Capabilities: [70] MSI-X: Enable+ Count=1024 Masked
> > > > > > > > 
> > > > > > > > so all vectors that the hw supports (PFs, VFs, misc, etc.). Because of that,
> > > > > > > > the total number of MSI-X in the devlink example from the cover letter is 1024.
> > > > > > > > 
> > > > > > > > I see that mellanox shows:
> > > > > > > > Capabilities: [9c] MSI-X: Enable+ Count=64 Masked
> > > > > > > > 
> > > > > > > > I assume that 64 is in this case the MSI-X count for only this one PF (it makes
> > > > > > > > sense).
> > > > > > > Yes and PF MSI-X count can be changed through FW configuration tool, as
> > > > > > > we need to write new value when the driver is unbound and we need it to
> > > > > > > be persistent. Users are expecting to see "stable" number any time they
> > > > > > > reboot the server. It is not the case for VFs, as they are explicitly
> > > > > > > created after reboots and start "fresh" after every boot.
> > > > > > > 
> > > > > > > So we set a large enough but not too large value as a default for PFs.
> > > > > > > If you find a sane model of how to change it through the kernel, you can
> > > > > > > count on our support.
> > > > > > 
> > > > > > I guess one main difference is that in the case of ice, the PF driver manages
> > > > > > resources for all its associated functions, not the FW. So the MSI-X count reported
> > > > > > for the PF shows the total vectors (PF netdev, VFs, rdma, SFs). VFs talk to the PF over a mailbox
> > > > > > to get their MSI-X vector information.
> > > > > 
> > > > > What is the output of lspci for ice VF when the driver is not bound?
> > > > > 
> > > > > Thanks
> > > > > 
> > > > 
> > > > It is the same after creating and after unbinding:
> > > > Capabilities: [70] MSI-X: Enable- Count=17 Masked-
> > > > 
> > > > 17, because 16 for traffic and 1 for mailbox.
> > > 
> > > Interesting, I think that your PF violates the PCI spec, as the spec always
> > > uses the word "function" and not "device" when talking about MSI-X related
> > > registers.
> > > 
> > > Thanks
> > > 
> > 
> > I made a mistake in one comment. 1024 isn't the MSI-X amount for the device.
> > On ice we have 2048 for the whole device. On a two-port card each PF has
> > 1024 MSI-X. Our control register mapping to the internal space looks
> > like this (assuming a two-port card; one VF with 5 MSI-X created):
> > INT[PF0].FIRST	0
> > 		1
> > 		2
> > 		
> > 		.
> > 		.
> > 		.
> > 
> > 		1019	INT[VF0].FIRST	__
> > 		1020			  | interrupts used
> > 		1021			  | by VF on PF0
> > 		1022			  |
> > INT[PF0].LAST	1023	INT[VF0].LAST	__|
> > INT[PF1].FIRST	1024
> > 		1025
> > 		1026
> > 
> > 		.
> > 		.
> > 		.
> > 		
> > INT[PF1].LAST	2047
> > 
> > The MSI-X entry table size for PF0 is 1024, but the entry table for the VF
> > is a part of PF0's physical space.
> > 
> > Do you mean that "sharing" the entries between PF and VF is a violation of
> > the PCI spec?
> 
> You should consult with your PCI specification experts. It was my
> spec interpretation, which can be wrong.
> 

Sure, I will
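
To restate the quoted layout in another form: each PF owns a contiguous
1024-entry block of the 2048 device-wide vectors, and a VF's vectors are
carved out of the tail of its parent PF's block. A quick sketch (the helper
names and the fixed 1024-per-PF split are illustrative only, not the actual
ice driver code):

```python
# Illustrative restatement of the interrupt layout from the diagram above:
# 2048 device-wide vectors, split as a fixed 1024-entry block per PF, with
# VF vectors taken from the tail of the parent PF's block.
VECTORS_PER_PF = 1024

def pf_range(pf_index):
    """First and last absolute vector index owned by a PF."""
    first = pf_index * VECTORS_PER_PF
    return first, first + VECTORS_PER_PF - 1

def vf_range(pf_index, vf_vectors):
    """A VF's vector range, carved from the tail of its parent PF's block."""
    _, pf_last = pf_range(pf_index)
    return pf_last - vf_vectors + 1, pf_last

print(pf_range(0))     # (0, 1023)
print(pf_range(1))     # (1024, 2047)
print(vf_range(0, 5))  # (1019, 1023) -- matches INT[VF0].FIRST/LAST above
```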

> 
> > So the sum of the MSI-X counts of all functions within a device shouldn't
> > be greater than 2048?
> 
> No, it is 2K per Message Control register, i.e. per-function.
> 
> 
> > It is hard to find anything about this in the spec. I only read
> > that "MSI-X supports a maximum table size of 2048 entries". I will
> > continue searching for information about that.
> > 
> > I don't think that from the driver's perspective we can change the table
> > size located in the Message Control register.
> 
> No, you can't, unless you decide to explicitly violate the spec.
> 
> > 
> > I assume that in the Mellanox case the tool you mentioned can modify this table size?
> 
> Yes, it is the FW configuration tool.
>

Thanks for the clarification.
In summary, if this really is a violation of the PCI spec, we will
certainly have to fix it, and the resource management implemented in this
patchset will not make sense (as the configuration of the PF MSI-X amount
will have to happen in FW).
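
For completeness, the table size in question is the Table Size field of the
MSI-X Message Control register, which the PCIe spec defines as read-only to
software and encoded as N-1 in bits 10:0 (Function Mask is bit 14, MSI-X
Enable is bit 15). A small decode sketch; the sample value is made up to
match the lspci output quoted earlier:

```python
# Decode a 16-bit MSI-X Message Control word per the PCIe spec layout:
#   bits 10:0  Table Size, stored as (number of entries - 1)
#   bit  14    Function Mask
#   bit  15    MSI-X Enable
def decode_msix_control(ctrl):
    return {
        "count": (ctrl & 0x7FF) + 1,
        "masked": bool(ctrl & (1 << 14)),
        "enabled": bool(ctrl & (1 << 15)),
    }

# 0x0010 (= 17 - 1) corresponds to lspci's "Enable- Count=17 Masked-"
# shown for the ice VF above.
print(decode_msix_control(0x0010))
# {'count': 17, 'masked': False, 'enabled': False}
```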

If it isn't, is it still a NACK from you? I mean, if we implement the
devlink resource management (maybe not a generic one, only defined in ice)
and sysfs for setting VF MSI-X, will you be OK with that? Or is using
devlink to manage MSI-X resources not a good solution in general?

Our motivation here is to allow the user to configure the MSI-X allocation
for their own use case, for example giving only 1 MSI-X each to ethernet
and RDMA and spending the rest on VFs. If I understand correctly, on a
Mellanox card the user can set the PF MSI-X count to 1 in the FW
configuration tool and achieve the same. Currently we don't have any other
mechanism to change the default number of MSI-X for ethernet.
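
To make the use case concrete, this is the kind of budgeting we would like
the user to be able to express (the numbers and the helper are purely
illustrative, not driver code):

```python
# Illustrative only: split a PF's 1024-vector budget, keeping 1 vector for
# ethernet and 1 for RDMA, and spending the rest on VFs at 17 vectors each
# (16 for traffic + 1 for mailbox, as for the ice VF quoted earlier).
PF_VECTORS = 1024

def vf_budget(eth=1, rdma=1, per_vf=17):
    """Return (number of fully provisioned VFs, leftover vectors)."""
    remaining = PF_VECTORS - eth - rdma
    return remaining // per_vf, remaining % per_vf

vfs, spare = vf_budget()
print(vfs, spare)  # 60 VFs, 2 vectors left over
```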

> Thanks
> 
> > 
> > Thanks
> > 
> > > > 
> > > > Thanks
> > > > > > 
> > > > > > 
> > > > > > 


Thread overview: 49+ messages
2022-11-14 12:57 [PATCH net-next 00/13] resource management using devlink reload Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 01/13] ice: move RDMA init to ice_idc.c Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 02/13] ice: alloc id for RDMA using xa_array Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 03/13] ice: cleanup in VSI config/deconfig code Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 04/13] ice: split ice_vsi_setup into smaller functions Michal Swiatkowski
2022-11-15  5:08   ` Jakub Kicinski
2022-11-15  6:49     ` Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 05/13] ice: stop hard coding the ICE_VSI_CTRL location Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 06/13] ice: split probe into smaller functions Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 07/13] ice: sync netdev filters after clearing VSI Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 08/13] ice: move VSI delete outside deconfig Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 09/13] ice: update VSI instead of init in some case Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 10/13] ice: implement devlink reinit action Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 11/13] ice: introduce eswitch capable flag Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 12/13] ice, irdma: prepare reservation of MSI-X to reload Michal Swiatkowski
2022-11-15  5:08   ` Jakub Kicinski
2022-11-15  6:49     ` Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 13/13] devlink, ice: add MSIX vectors as devlink resource Michal Swiatkowski
2022-11-14 15:28   ` Jiri Pirko
2022-11-14 16:03     ` Piotr Raczynski
2022-11-15  6:56       ` Michal Swiatkowski
2022-11-15 12:08       ` Jiri Pirko
2022-11-14 13:23 ` [PATCH net-next 00/13] resource management using devlink reload Leon Romanovsky
2022-11-14 15:31   ` Samudrala, Sridhar
2022-11-14 16:58     ` Keller, Jacob E
2022-11-14 17:09       ` Leon Romanovsky
2022-11-15  7:00         ` Michal Swiatkowski
2022-11-14 17:07     ` Leon Romanovsky
2022-11-15  7:12       ` Michal Swiatkowski
2022-11-15  8:11         ` Leon Romanovsky
2022-11-15  9:04           ` Michal Swiatkowski
2022-11-15  9:32             ` Leon Romanovsky
2022-11-15 10:16               ` Michal Swiatkowski
2022-11-15 12:12                 ` Leon Romanovsky
2022-11-15 14:02                   ` Michal Swiatkowski
2022-11-15 17:57                     ` Leon Romanovsky
2022-11-16  1:59                       ` Samudrala, Sridhar
2022-11-16  6:04                         ` Leon Romanovsky
2022-11-16 12:04                           ` Michal Swiatkowski
2022-11-16 17:59                             ` Leon Romanovsky
2022-11-17 11:10                               ` Michal Swiatkowski
2022-11-17 11:45                                 ` Leon Romanovsky
2022-11-17 13:39                                   ` Michal Swiatkowski [this message]
2022-11-17 17:38                                     ` Leon Romanovsky
2022-11-18  3:36                                       ` Jakub Kicinski
2022-11-18  6:20                                         ` Leon Romanovsky
2022-11-18 14:23                                           ` Saleem, Shiraz
2022-11-18 17:31                                             ` Leon Romanovsky
2022-11-20 22:24                                               ` Samudrala, Sridhar
