From: Leon Romanovsky <leon@kernel.org>
To: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Cc: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>,
	netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org,
	pabeni@redhat.com, edumazet@google.com,
	intel-wired-lan@lists.osuosl.org, jiri@nvidia.com,
	anthony.l.nguyen@intel.com, alexandr.lobakin@intel.com,
	wojciech.drewek@intel.com, lukasz.czapnik@intel.com,
	shiraz.saleem@intel.com, jesse.brandeburg@intel.com,
	mustafa.ismail@intel.com, przemyslaw.kitszel@intel.com,
	piotr.raczynski@intel.com, jacob.e.keller@intel.com,
	david.m.ertman@intel.com, leszek.kaliszczuk@intel.com
Subject: Re: [PATCH net-next 00/13] resource management using devlink reload
Date: Tue, 15 Nov 2022 11:32:14 +0200
Message-ID: <Y3NcnnNtmL+SSLU+@unreal>
In-Reply-To: <Y3NWMVF2LV/0lqJX@localhost.localdomain>

On Tue, Nov 15, 2022 at 10:04:49AM +0100, Michal Swiatkowski wrote:
> On Tue, Nov 15, 2022 at 10:11:10AM +0200, Leon Romanovsky wrote:
> > On Tue, Nov 15, 2022 at 08:12:52AM +0100, Michal Swiatkowski wrote:
> > > On Mon, Nov 14, 2022 at 07:07:54PM +0200, Leon Romanovsky wrote:
> > > > On Mon, Nov 14, 2022 at 09:31:11AM -0600, Samudrala, Sridhar wrote:
> > > > > On 11/14/2022 7:23 AM, Leon Romanovsky wrote:
> > > > > > On Mon, Nov 14, 2022 at 01:57:42PM +0100, Michal Swiatkowski wrote:
> > > > > > > Currently the default value for number of PF vectors is number of CPUs.
> > > > > > > Because of that there are cases when all vectors are used for PF
> > > > > > > and user can't create more VFs. It is hard to set default number of
> > > > > > > CPUs right for all different use cases. Instead allow user to choose
> > > > > > > how many vectors should be used for various features. After implementing
> > > > > > > subdevices this mechanism will be also used to set number of vectors
> > > > > > > for subfunctions.
> > > > > > > 
> > > > > > > The idea is to set vectors for eth or VFs using devlink resource API.
> > > > > > > New value of vectors will be used after devlink reinit. Example
> > > > > > > commands:
> > > > > > > $ sudo devlink resource set pci/0000:31:00.0 path msix/msix_eth size 16
> > > > > > > $ sudo devlink dev reload pci/0000:31:00.0
> > > > > > > After reload driver will work with 16 vectors used for eth instead of
> > > > > > > num_cpus.
> > > > > > By saying "vectors", are you referring to MSI-X vectors?
> > > > > > If yes, you have specific interface for that.
> > > > > > https://lore.kernel.org/linux-pci/20210314124256.70253-1-leon@kernel.org/
> > > > > 
> > > > > This patch series is exposing a resources API to split the device level MSI-X vectors
> > > > > across the different functions supported by the device (PF, RDMA, SR-IOV VFs and
> > > > > in future subfunctions). Today this is all hidden in a policy implemented within
> > > > > the PF driver.
> > > > 
> > > > Maybe we are talking about different VFs, but if you refer to PCI VFs,
> > > > the amount of MSI-X comes from PCI config space for that specific VF.
> > > > 
> > > > You shouldn't set any value through netdev as it will cause a
> > > > difference in output between lspci (which doesn't require any driver)
> > > > and your newly set number.
> > > 
> > > If I understand correctly, lspci shows the MSI-X number for individual
> > > VF. Value set via devlink is the total number of MSI-X that can be used
> > > when creating VFs. 
> > 
> > Yes and no, lspci shows how many MSI-X vectors exist from HW point of
> > view. Driver can use less than that. It is exactly as your proposed
> > devlink interface.
> > 
> > 
> 
> Ok, I have to take a closer look at it. So, are you saying that we should
> drop this devlink solution and use sysfs interface for VFs or are you
> fine with having both? What about MSI-X allocation for subfunctions?

You should drop for VFs and PFs and keep it for SFs only.

> 
> > > As Jake said I will fix the code to track both values. Thanks for pointing the patch.
> > > 
> > > > 
> > > > Also in RDMA case, it is not clear what will you achieve by this
> > > > setting too.
> > > >
> > > 
> > > We have limited number of MSI-X (1024) in the device. Because of that
> > > the amount of MSI-X for each feature is set to the best values. Half for
> > > ethernet, half for RDMA. This patchset allows user to change these values.
> > > If he wants more MSI-X for ethernet, he can decrease MSI-X for RDMA.
> > 
> > RDMA devices don't have PCI logic and everything is controlled through
> > your main core module. It means that when you create RDMA auxiliary device,
> > it will be connected to netdev (RoCE and iWARP) and that netdev should
> > deal with vectors. So I still don't understand what does it mean "half
> > for RDMA".
> >
> 
> Yes, it is controlled by module, but during probe, MSI-X vectors for RDMA
> are reserved and can't be used by ethernet. For example I have
> 64 CPUs, when loading I get 64 vectors from HW for ethernet and 64 for
> RDMA. The vectors for RDMA will be consumed by irdma driver, so I won't
> be able to use it in ethernet and vice versa.
> 
> By saying it can't be used I mean that irdma driver received the MSI-X
> vectors number and it is using them (connected them with RDMA interrupts).
> 
> Devlink resource is a way to change the number of MSI-X vectors that
> will be reserved for RDMA. You wrote that netdev should deal with
> vectors, but how netdev will know how many vectors should go to RDMA aux
> device? Is there an interface for setting the vectors amount for RDMA
> device?

When RDMA core adds device, it calls to irdma_init_rdma_device() and
num_comp_vectors is actually the number of MSI-X vectors which you want
to give to that device.

I'm trying to say that probably you shouldn't reserve specific vectors
for both ethernet and RDMA and simply share same vectors. RDMA applications
that care about performance set comp_vector through user space verbs anyway.
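
To illustrate the last point: a user-space verbs application chooses the
completion vector itself when it creates a CQ, so shared vectors can still be
spread across CPUs by the application. The following is only a minimal sketch
of that selection step. pick_comp_vector() is a hypothetical helper (it is not
part of libibverbs or irdma), and the round-robin policy is an assumption made
for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of libibverbs or irdma): map a CPU index
 * onto one of the device's completion vectors by simple round-robin.
 * In a real application num_comp_vectors would be read from
 * struct ibv_context::num_comp_vectors.
 */
static int pick_comp_vector(int cpu, int num_comp_vectors)
{
	assert(cpu >= 0 && num_comp_vectors > 0);
	return cpu % num_comp_vectors;
}

/*
 * The chosen value is what an application would pass as the last
 * argument (comp_vector) of ibv_create_cq(), for example:
 *
 *   cq = ibv_create_cq(ctx, cqe, NULL, channel,
 *                      pick_comp_vector(cpu, ctx->num_comp_vectors));
 */
```

With 64 completion vectors, a thread pinned to CPU 65 would land on vector 1
under this policy; other policies (NUMA-aware, per-queue) are equally possible.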

Thanks

> 
> Thanks
> 
> > Thanks

Thread overview: 49+ messages
2022-11-14 12:57 [PATCH net-next 00/13] resource management using devlink reload Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 01/13] ice: move RDMA init to ice_idc.c Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 02/13] ice: alloc id for RDMA using xa_array Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 03/13] ice: cleanup in VSI config/deconfig code Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 04/13] ice: split ice_vsi_setup into smaller functions Michal Swiatkowski
2022-11-15  5:08   ` Jakub Kicinski
2022-11-15  6:49     ` Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 05/13] ice: stop hard coding the ICE_VSI_CTRL location Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 06/13] ice: split probe into smaller functions Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 07/13] ice: sync netdev filters after clearing VSI Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 08/13] ice: move VSI delete outside deconfig Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 09/13] ice: update VSI instead of init in some case Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 10/13] ice: implement devlink reinit action Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 11/13] ice: introduce eswitch capable flag Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 12/13] ice, irdma: prepare reservation of MSI-X to reload Michal Swiatkowski
2022-11-15  5:08   ` Jakub Kicinski
2022-11-15  6:49     ` Michal Swiatkowski
2022-11-14 12:57 ` [PATCH net-next 13/13] devlink, ice: add MSIX vectors as devlink resource Michal Swiatkowski
2022-11-14 15:28   ` Jiri Pirko
2022-11-14 16:03     ` Piotr Raczynski
2022-11-15  6:56       ` Michal Swiatkowski
2022-11-15 12:08       ` Jiri Pirko
2022-11-14 13:23 ` [PATCH net-next 00/13] resource management using devlink reload Leon Romanovsky
2022-11-14 15:31   ` Samudrala, Sridhar
2022-11-14 16:58     ` Keller, Jacob E
2022-11-14 17:09       ` Leon Romanovsky
2022-11-15  7:00         ` Michal Swiatkowski
2022-11-14 17:07     ` Leon Romanovsky
2022-11-15  7:12       ` Michal Swiatkowski
2022-11-15  8:11         ` Leon Romanovsky
2022-11-15  9:04           ` Michal Swiatkowski
2022-11-15  9:32             ` Leon Romanovsky [this message]
2022-11-15 10:16               ` Michal Swiatkowski
2022-11-15 12:12                 ` Leon Romanovsky
2022-11-15 14:02                   ` Michal Swiatkowski
2022-11-15 17:57                     ` Leon Romanovsky
2022-11-16  1:59                       ` Samudrala, Sridhar
2022-11-16  6:04                         ` Leon Romanovsky
2022-11-16 12:04                           ` Michal Swiatkowski
2022-11-16 17:59                             ` Leon Romanovsky
2022-11-17 11:10                               ` Michal Swiatkowski
2022-11-17 11:45                                 ` Leon Romanovsky
2022-11-17 13:39                                   ` Michal Swiatkowski
2022-11-17 17:38                                     ` Leon Romanovsky
2022-11-18  3:36                                       ` Jakub Kicinski
2022-11-18  6:20                                         ` Leon Romanovsky
2022-11-18 14:23                                           ` Saleem, Shiraz
2022-11-18 17:31                                             ` Leon Romanovsky
2022-11-20 22:24                                               ` Samudrala, Sridhar
