From: Ashish Kalra <ashish.kalra@amd.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: hch@lst.de, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	hpa@zytor.com, x86@kernel.org, luto@kernel.org,
	peterz@infradead.org, dave.hansen@linux-intel.com,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	brijesh.singh@amd.com, Thomas.Lendacky@amd.com,
	ssg.sos.patches@amd.com, jon.grimm@amd.com, rientjes@google.com
Subject: Re: [PATCH v3] swiotlb: Adjust SWIOTBL bounce buffer size for SEV guests.
Date: Tue, 17 Nov 2020 15:33:34 +0000	[thread overview]
Message-ID: <20201117153302.GA29293@ashkalra_ubuntu_server> (raw)
In-Reply-To: <20201113211925.GA6096@char.us.oracle.com>

On Fri, Nov 13, 2020 at 04:19:25PM -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Nov 05, 2020 at 09:20:45PM +0000, Ashish Kalra wrote:
> > On Thu, Nov 05, 2020 at 03:20:07PM -0500, Konrad Rzeszutek Wilk wrote:
> > > On Thu, Nov 05, 2020 at 07:38:28PM +0000, Ashish Kalra wrote:
> > > > On Thu, Nov 05, 2020 at 02:06:49PM -0500, Konrad Rzeszutek Wilk wrote:
> > > > > .
> > > > > > > Right, so I am wondering if we can do this better.
> > > > > > > 
> > > > > > > That is, you are never going to get any 32-bit devices with SEV, right? That
> > > > > > > is, there is nothing that binds you to always use the memory below 4GB?
> > > > > > > 
> > > > > > 
> > > > > > We do support 32-bit PCIe passthrough devices with SEV.
> > > > > 
> > > > > Ewww..  Which devices would this be?
> > > > 
> > > > That will be difficult to predict as customers could be doing
> > > > passthrough of all kinds of devices.
> > > 
> > > But SEV is not on some 1990s hardware. It has PCIe; there are no PCI slots in there.
> > > 
> > > Is it really possible to have a PCIe device that can't do more than 32-bit DMA?
> > > 
> > > > 
> > > > > > 
> > > > > > Therefore, we can't just depend on >4G memory for SWIOTLB bounce buffering
> > > > > > when there is I/O pressure, because we do need to support device
> > > > > > passthrough of 32-bit devices.
> > > > > 
> > > > > Presumably there is just a handful of them?
> > > > >
> > > > Again, it will be incorrect to assume this.
> > > > 
> > > > > > 
> > > > > > Considering this, we believe that this patch needs to adjust/extend
> > > > > > the boot-allocation of SWIOTLB, and we want to keep it simple by doing this
> > > > > > within a range determined by the amount of allocated guest memory.
> > > > > 
> > > > > I would prefer not to have to revert this in a year as customers
> > > > > complain "I paid $$$ and I am wasting half a gig on something
> > > > > I am not using", and not to end up giving customers knobs to tweak this
> > > > > instead of doing the right thing from the start.
> > > > 
> > > > Currently, we face a lot of situations where we have to tell our
> > > > internal teams/external customers to explicitly increase the SWIOTLB buffer
> > > > via the swiotlb parameter on the kernel command line, especially to
> > > > get better I/O performance numbers with SEV.
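
(For context, a hedged illustration of the knob being referred to above: the
documented swiotlb= kernel parameter takes the number of 2KB I/O TLB slabs,
so a guest could be booted with, for example:

    swiotlb=131072    # 131072 slabs x 2KB/slab = 256MB bounce buffer vs. the 64MB default

The exact value is only illustrative; it is not a recommendation made anywhere
in this thread.)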
> > > 
> > > Presumably these are 64-bit?
> > > 
> > > And what devices do you speak of that are actually affected by
> > > this performance issue? Increasing the SWIOTLB just means we have more
> > > memory, which in my mind means you can have _more_ devices in the guest
> > > that won't handle the fact that DMA mapping returns an error.
> > > 
> > > Not necessarily that one device suddenly can go faster.
> > > 
> > > > 
> > > > So having this SWIOTLB size adjustment done implicitly (even using
> > > > static logic) is a great win-win situation. In other words, having even
> > > > a simple and static default increase of the SWIOTLB buffer size for SEV is
> > > > really useful for us.
> > > > 
> > > > We can always think of adding all kinds of heuristics to this, but that
> > > > just adds too much complexity without any predictable performance gain.
> > > > 
> > > > And to add, the patch extends the SWIOTLB size via an architecture-specific
> > > > callback; currently it is simple and static logic specific to SEV/x86,
> > > > but there is always an option to tweak/extend it with additional logic
> > > > in the future.
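
A minimal sketch, with hypothetical names, of the shape being described above
(this is not the actual v3 patch, only an illustration of a weak arch hook
that SEV/x86 could override):

    /* e.g. in kernel/dma/swiotlb.c -- generic default, no adjustment */
    unsigned long __init __weak arch_swiotlb_default_size(unsigned long bytes)
    {
            return bytes;                   /* keep the usual 64MB default */
    }

    /* e.g. in arch/x86/mm/mem_encrypt.c -- hypothetical SEV override that
     * scales the bounce buffer with guest memory, capped at a maximum */
    unsigned long __init arch_swiotlb_default_size(unsigned long bytes)
    {
            unsigned long total = PFN_PHYS(max_pfn);

            if (!sev_active())
                    return bytes;

            /* example policy only: ~6% of guest memory, capped at 1GB */
            return min(total / 16, 1UL << 30);
    }

The exact sizing policy in the real patch may differ; the point is only that
the policy sits behind an architecture-specific callback, as stated above.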
> > > 
> > > Right, and that is what I would like to talk about as I think you
> > > are going to disappear (aka, busy with other stuff) after this patch goes in.
> > > 
> > > I need to understand this more than "performance" and "internal teams"
> > > requirements to come up with a better way going forward as surely other
> > > platforms will hit the same issue anyhow.
> > > 
> > > Let's break this down:
> > > 
> > > How does the performance improve for one single device if you increase the SWIOTLB?
> > > Is there a specific device/driver that you can talk about that improves with this patch?
> > > 
> > > 
> > 
> > Yes, these are mainly multi-queue devices such as NICs or even
> > multi-queue virtio.
> > 
> > This basically improves performance with concurrent DMA, hence it mainly
> > benefits multi-queue devices.
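
(A hedged back-of-envelope example of why concurrent DMA matters here: the
default pool is 64MB, carved into 2KB slots, and a single mapping can span at
most 128 contiguous slots (256KB). If a 16-queue NIC keeps, say, 64 in-flight
64KB bounce-buffered buffers per queue, then:

    16 queues x 64 buffers x 64KB = 64MB

which already equals the entire default pool, before fragmentation is even
considered. The per-queue numbers are illustrative, not measurements reported
in this thread.)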
> 
> OK, and for a _1GB_ guest - what number of CPUs do the "internal teams/external
> customers" use? Please, let's use real use-cases.

> > I am sure you will understand we cannot share any external customer
> > data, as all that customer information is proprietary.
> >
> > In a similar situation, if you had to share Oracle data, you would
> > surely have the same concerns, and I don't think you would be able
> > to share any such information externally, i.e., outside Oracle.
> >
> I am asking a simple question - what number of CPUs does a 1GB
> guest have? The reason for this should be fairly obvious - if
> it is 1 vCPU, then there is no multi-queue and the existing
> SWIOTLB pool size is OK as it is.
>
> If however there are say 2 and multiqueue is enabled, that
> gives me an idea of how many you use, and I can find out what
> the maximum pool size usage of virtio is with that configuration.

Again, we cannot share any customer data.

Also, I don't think there can be a definitive answer to how many vCPUs a
1GB guest will have; it will depend on what kind of configuration we are
testing.

For example, I usually set up 4-16 vCPUs for as little as 512M of configured
guest memory.

I have also been testing with 16 vCPU configurations for 512M-1G guest
memory with Mellanox SR-IOV NICs, and this is a multi-queue NIC
device environment.

So we might have less configured guest memory, but we might still be
using that configuration with I/O-intensive workloads.

Thanks,
Ashish
