From: Wei Yang <weiyang@linux.vnet.ibm.com>
To: Benjamin Herrenschmidt <benh@au1.ibm.com>
Cc: Wei Yang <weiyang@linux.vnet.ibm.com>,
	Myron Stowe <myron.stowe@redhat.com>,
	linux-pci@vger.kernel.org, gwshan@linux.vnet.ibm.com,
	Donald Dutile <ddutile@redhat.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH V9 03/18] PCI: Add weak pcibios_iov_resource_size() interface
Date: Wed, 19 Nov 2014 11:21:00 +0800
Message-ID: <20141119032100.GA7105@richard>
In-Reply-To: <1416363332.5704.18.camel@au1.ibm.com>

On Wed, Nov 19, 2014 at 01:15:32PM +1100, Benjamin Herrenschmidt wrote:
>On Tue, 2014-11-18 at 18:12 -0700, Bjorn Helgaas wrote:
>> 
>> Can you help me understand this?
>> 
>> We have previously called sriov_init() on the PF.  There, we sized the VF
>> BARs, which are in the PF's SR-IOV Capability (SR-IOV spec sec 3.3.14).
>> The size we discover is the amount of space required by a single VF, so
>> sriov_init() adjusts PF->resource[PCI_IOV_RESOURCES + i] by multiplying
>> that size by PCI_SRIOV_TOTAL_VF, so this PF resource is now big enough to 
>> hold the VF BAR[i] areas for all the possible VFs.
>
>So I'll let Richard (Wei) answer on the details but I'll just chime in
>about the "big picture". This isn't about changing the spacing between VFs
>which is handled by the system page size.
>
>This is about the way we create MMIO windows from the CPU to the VF BARs.
>
>Basically, we have a (limited) set of 64-bit windows we can create that
>are divided in equal sized segments (256 of them), each segment assigned
>in HW to one of our Partitionable Endpoints (aka domain).
>
>So even if we only ever create 16 VFs for a device, we need to use an
>entire one of these windows, which will use 256*VF_size and thus allocate
>that much space. Also the window has to be naturally aligned.
>
>We can then assign the VF BAR to a spot inside that window that corresponds
>to the range of PEs that we have assigned to that device (which typically
>isn't going to be the beginning of the window).
>

Bjorn & Ben,

Let me try to explain. Thanks to Ben for his explanation, which is helpful. We
are not trying to change the spacing between VFs.

As Ben mentioned, we use HW to map the MMIO space to PEs, but the HW must map
256 segments of the same size. This leads to a situation like this:

   +------+------+        +------+------+------+------+
   |VF#0  |VF#1  |   ...  |      |VF#N-1|PF#A  |PF#B  |
   +------+------+        +------+------+------+------+

Suppose N = 254 and the HW maps these 256 segments to their corresponding PE#.
This introduces a problem: PF#A and PF#B have already been assigned to some
PE#, and we can't map one MMIO range to two different PE#s.

What we have done is to "expand the IOV BAR" so that it covers all 256 HW
segments. After that, the MMIO range looks like this:

   +------+------+        +------+------+------+------+------+------+
   |VF#0  |VF#1  |   ...  |      |VF#N-1|blank |blank |PF#A  |PF#B  |
   +------+------+        +------+------+------+------+------+------+

We use a trick to "expand" the IOV BAR, which ensures that there is no overlap
between the VFs' PEs and the PFs' PEs.
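
To make the overlap concrete, here is a minimal sketch in Python. It is only
an illustration of the segment arithmetic, not kernel code. It assumes a
window that starts at address 0 and 4KB segments (one VF BAR each);
NUM_SEGMENTS, segment_index() and the addresses are hypothetical names and
values picked for the example.

```python
NUM_SEGMENTS = 256  # equal-sized segments per HW window, as described above


def segment_index(window_base, segment_size, addr):
    """Return the index of the window segment containing addr, or None if
    addr lies outside the window."""
    offset = addr - window_base
    if not 0 <= offset < NUM_SEGMENTS * segment_size:
        return None
    return offset // segment_size


SEG = 0x1000  # assumed 4KB segment size
N = 254       # number of VFs, as in the example above

# Without expansion, PF#A's BAR directly follows the last VF BAR.
pf_a_unexpanded = N * SEG
# With expansion, the IOV BAR spans all 256 segments and PF#A comes after it.
pf_a_expanded = NUM_SEGMENTS * SEG

print(segment_index(0, SEG, pf_a_unexpanded))  # 254: inside the VF window
print(segment_index(0, SEG, pf_a_expanded))    # None: outside the VF window
```

In the unexpanded case PF#A lands in segment 254 of the window, which the HW
would map to a VF PE. After expansion it falls outside the window entirely.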

This changes the IOV BAR size from:
   
   IOV BAR size = (VF BAR aperture size) * VF_number

 to:
   
   IOV BAR size = (VF BAR aperture size) * 256

This is why we need a platform-dependent method to get the VF BAR size.
Otherwise the VF BAR size, derived by dividing the IOV BAR size by the number
of VFs, would not be correct.
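
The two formulas can be checked with a small sketch in Python. This is
illustrative arithmetic only. The function names are hypothetical, and the
4KB VF BAR with 4 VFs is an assumed configuration.

```python
HW_SEGMENTS = 256  # number of segments the HW window is divided into


def iov_bar_size(vf_bar_size, total_vfs, expanded):
    """IOV BAR size before (expanded=False) or after (expanded=True) expansion."""
    return vf_bar_size * (HW_SEGMENTS if expanded else total_vfs)


def vf_bar_size_generic(iov_size, total_vfs):
    """Per-VF size obtained by dividing the IOV BAR size by the number of VFs."""
    return iov_size // total_vfs


def vf_bar_size_platform(iov_size):
    """Per-VF size when the platform knows the IOV BAR covers 256 segments."""
    return iov_size // HW_SEGMENTS


expanded = iov_bar_size(0x1000, 4, expanded=True)
print(hex(expanded))                          # 0x100000 (1024KB)
print(hex(vf_bar_size_generic(expanded, 4)))  # 0x40000, not the real 4KB
print(hex(vf_bar_size_platform(expanded)))    # 0x1000, the real 4KB
```

Dividing the expanded size by the number of VFs gives 256KB per VF, which is
the wrong answer that the platform-dependent method is meant to avoid.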

Now let's take a look at your example again.

  PF SR-IOV Capability
    TotalVFs = 4
    NumVFs = 4
    System Page Size = 4KB
    VF BAR0 = [mem 0x00000000-0x00000fff] (4KB at address 0)

  PF  pci_dev->resource[7] = [mem 0x00000000-0x00003fff] (16KB)
  VF1 pci_dev->resource[0] = [mem 0x00000000-0x00000fff]
  VF2 pci_dev->resource[0] = [mem 0x00001000-0x00001fff]
  VF3 pci_dev->resource[0] = [mem 0x00002000-0x00002fff]
  VF4 pci_dev->resource[0] = [mem 0x00003000-0x00003fff]

After our expansion, the only difference is that the IOV BAR size is 256*4KB
instead of 16KB, so it looks like this:

  PF  pci_dev->resource[7] = [mem 0x00000000-0x000fffff] (1024KB)
  VF1 pci_dev->resource[0] = [mem 0x00000000-0x00000fff]
  VF2 pci_dev->resource[0] = [mem 0x00001000-0x00001fff]
  VF3 pci_dev->resource[0] = [mem 0x00002000-0x00002fff]
  VF4 pci_dev->resource[0] = [mem 0x00003000-0x00003fff]
  ...
  and the remaining 252 4KB segments are left unused.

So the start address and size of each VF BAR do not change, but the PF's IOV
BAR is expanded.
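
The ranges listed above can be reproduced with a short Python sketch. It is
illustrative only: layout() and fmt_mem() are hypothetical helpers, and the
IOV BAR is assumed to start at address 0 as in the example.

```python
HW_SEGMENTS = 256  # segments per HW window


def fmt_mem(start, size):
    """Format a range the way the resource lines above are written."""
    return f"[mem {start:#010x}-{start + size - 1:#010x}]"


def layout(iov_start, vf_bar_size, num_vfs, expanded):
    """Return (PF IOV BAR range, list of per-VF BAR ranges)."""
    slots = HW_SEGMENTS if expanded else num_vfs
    pf = fmt_mem(iov_start, vf_bar_size * slots)
    vfs = [fmt_mem(iov_start + i * vf_bar_size, vf_bar_size)
           for i in range(num_vfs)]
    return pf, vfs


for expanded in (False, True):
    pf, vfs = layout(0, 0x1000, 4, expanded)
    print("PF ", pf)
    for i, vf in enumerate(vfs, start=1):
        print(f"VF{i}", vf)
```

The VF lines are identical in both passes. Only the PF range grows, from
0x00003fff to 0x000fffff at its upper end.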

>Cheers,
>Ben.
>

-- 
Richard Yang
Help you, Help me
