From: "Łukasz Gieryk" <lukasz.gieryk@linux.intel.com>
To: Klaus Jensen <its@irrelevant.dk>
Cc: Keith Busch <kbusch@kernel.org>,
	Lukasz Maniak <lukasz.maniak@linux.intel.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Subject: Re: [PATCH 10/15] hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtime
Date: Thu, 21 Oct 2021 15:40:12 +0200
Message-ID: <20211021134012.GA30845@lgieryk-VirtualBox>
In-Reply-To: <YXBonn0gwolecWnp@apples.localdomain>

On Wed, Oct 20, 2021 at 09:06:06PM +0200, Klaus Jensen wrote:
> On Oct  7 18:24, Lukasz Maniak wrote:
> > From: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
> > 
> > The Nvme device defines two properties: max_ioqpairs, msix_qsize. Having
> > them as constants is problematic for SR-IOV support.
> > 
> > The SR-IOV feature introduces virtual resources (queues, interrupts)
> > that can be assigned to PF and its dependent VFs. Each device, following
> > a reset, should work with the configured number of queues. A single
> > constant is no longer sufficient to hold the whole state.
> > 
> > This patch tries to solve the problem by introducing additional
> > variables in NvmeCtrl’s state. The variables for, e.g., managing queues
> > are therefore organized as:
> > 
> >  - n->params.max_ioqpairs – no changes, constant set by the user.
> > 
> >  - n->max_ioqpairs - (new) value derived from n->params.* in realize();
> >                      constant through device’s lifetime.
> > 
> >  - n->(mutable_state) – (not a part of this patch) user-configurable,
> >                         specifies number of queues available _after_
> >                         reset.
> > 
> >  - n->conf_ioqpairs - (new) used in all the places instead of the ‘old’
> >                       n->params.max_ioqpairs; initialized in realize()
> >                       and updated during reset() to reflect user’s
> >                       changes to the mutable state.
> > 
> > Since the number of available i/o queues and interrupts can change in
> > runtime, buffers for sq/cqs and the MSIX-related structures are
> > allocated big enough to handle the limits, to completely avoid the
> > complicated reallocation. A helper function (nvme_update_msixcap_ts)
> > updates the corresponding capability register, to signal configuration
> > changes.
> > 
> > Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
> 
> Instead of this, how about adding new parameters, say, sriov_vi_private
> and sriov_vq_private. Then, max_ioqpairs and msix_qsize are still the
> "physical" limits and the new parameters just reserve some for the
> primary controller, the rest being available for flexible resources.

Compare your configuration:

    max_ioqpairs     = 26
    sriov_max_vfs    = 4
    sriov_vq_private = 10

with mine:

    max_ioqpairs        = 10
    sriov_max_vfs       = 4
    sriov_max_vq_per_vf = 4

In your version, if I wanted to change max_vfs but keep the same number
of flexible resources per VF, I would have to do some math and update
max_ioqpairs. I would then also have to adjust the other,
interrupt-related parameter, as it is affected too. In my opinion
that is quite inconvenient.
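
To make the "some math" concrete, here is a minimal sketch of the
relation between the two configurations. The helper name
total_ioqpairs() is hypothetical and exists only for this illustration;
it is not part of the patch. It assumes the total is simply the private
count plus max_vfs times the per-VF count, which is what the sample
numbers above imply.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not from the patch).
 * Computes the total number of I/O queue pairs the device must back,
 * assuming: total = private + max_vfs * vq_per_vf.
 */
static inline uint32_t total_ioqpairs(uint32_t private_qpairs,
                                      uint32_t max_vfs,
                                      uint32_t vq_per_vf)
{
    return private_qpairs + max_vfs * vq_per_vf;
}
```

With the numbers above, total_ioqpairs(10, 4, 4) gives 26, which is the
max_ioqpairs value in the first configuration. Changing max_vfs to 8
gives 42: in the first configuration the user has to compute that and
set max_ioqpairs by hand, in the second one only sriov_max_vfs changes.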
 
Now, even if I changed the semantics of the params, I would still need
most of this patch. (Let's keep the discussion about whether the max_*
fields are necessary in the other thread.)

Without virtualization, the maximum number of queues is constant. The
user (i.e., the nvme kernel driver) can only query this value (e.g., 10)
and must stay within this limit.

With virtualization, the flexible resources kick in. Let's continue with
the sample numbers defined earlier (10 private + 16 flexible resources).

1) The device boots; all 16 flexible queues are assigned to the primary
   controller.
2) The nvme kernel driver queries the limit (10+16=26) and can
   create/use up to that many queues.
3) The user, via the Virtualization Management command, unbinds some
   (let's say 2) of the flexible queues from the primary controller and
   assigns them to a secondary controller.
4) After reset, the Physical Function Device reports a different limit
   (24), and when the Virtual Device shows up, it will report 1 (the
   admin queue consumed the other resource).

So I need an additional variable in the state to store the intermediate
limit (24 or 1), as none of the existing params holds the correct value,
and all the places that validate limits must use that value.
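
The four steps can be sketched as a small standalone model. Everything
here is hypothetical: struct ctrl_model, model_reset() and
model_move_flexible() are illustration-only names and do not match
NvmeCtrl or any function in the patch. The sketch assumes the accounting
used in the steps above: the reported limit is refreshed only at reset,
and a secondary controller spends one of its resources on the admin
queue.

```c
#include <stdint.h>

/* Hypothetical model of one controller (not the real NvmeCtrl). */
struct ctrl_model {
    uint32_t private_q;     /* private resources, fixed for the lifetime */
    uint32_t flexible_q;    /* flexible resources currently assigned */
    uint32_t conf_ioqpairs; /* limit the driver sees; set only at reset */
    int is_vf;              /* nonzero for a secondary controller */
};

/* Recompute the visible limit from the currently assigned resources. */
static inline void model_reset(struct ctrl_model *c)
{
    uint32_t total = c->private_q + c->flexible_q;

    if (c->is_vf) {
        /* Assumption: one resource backs the VF's admin queue. */
        c->conf_ioqpairs = total > 0 ? total - 1 : 0;
    } else {
        c->conf_ioqpairs = total;
    }
}

/*
 * Move n flexible resources between controllers. The visible limits
 * are left untouched until each controller is reset.
 * Returns 0 on success, -1 if 'from' does not have enough resources.
 */
static inline int model_move_flexible(struct ctrl_model *from,
                                      struct ctrl_model *to, uint32_t n)
{
    if (from->flexible_q < n) {
        return -1;
    }
    from->flexible_q -= n;
    to->flexible_q += n;
    return 0;
}
```

Starting with a PF holding 10 private and 16 flexible resources, a reset
yields a limit of 26. Moving 2 flexible resources to a VF leaves the
PF's limit at 26 until the next reset, after which the PF reports 24 and
the VF reports 1. The separate conf_ioqpairs field is what carries that
intermediate value.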


