LKML Archive on lore.kernel.org
From: Vikram Sethi <vsethi@nvidia.com>
To: Marc Zyngier <maz@kernel.org>,
	Shanker Donthineni <sdonthineni@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Will Deacon <will@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christoffer Dall <christoffer.dall@arm.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Jason Sequeira <jsequeira@nvidia.com>
Subject: RE: [RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA
Date: Fri, 30 Apr 2021 16:57:14 +0000
Message-ID: <BL0PR12MB2532CC436EBF626966B15994BD5E9@BL0PR12MB2532.namprd12.prod.outlook.com>
In-Reply-To: <878s4zokll.wl-maz@kernel.org>

Hi Marc, 

> -----Original Message-----
> From: Marc Zyngier <maz@kernel.org>
> Sent: Friday, April 30, 2021 10:31 AM
> On Fri, 30 Apr 2021 15:58:14 +0100,
> Shanker R Donthineni <sdonthineni@nvidia.com> wrote:
> >
> > Hi Marc,
> >
> > On 4/30/21 6:47 AM, Marc Zyngier wrote:
> > >
> > >>>> We've two concerns here:
> > >>>>    - Performance impacts for pass-through devices.
> > >>>>    - The definition of ioremap_wc() function doesn't match the
> > >>>> host kernel on ARM64
> > >>> Performance I can understand, but I think you're also using it to
> > >>> mask a driver bug which should be resolved first.  Thank
> > >> We’ve already instrumented the driver code and found the code path
> > >> for the unaligned accesses. We’ll fix this issue if it’s not
> > >> following WC semantics.
> > >>
> > >> Fixing the performance concern will be under KVM stage-2 page-table
> > >> control. We're looking for a guidance/solution for updating stage-2
> > >> PTE based on PCI-BAR attribute.
> > > Before we start discussing the *how*, I'd like to clearly understand
> > > what *arm64* memory attributes you are relying on. We already have
> > > established that the unaligned access was a bug, which was the
> > > > biggest argument in favour of NORMAL_NC. What are the other
> > > > requirements?
> > Sorry, my earlier response was not complete...
> >
> > ARMv8 architecture has two features Gathering and Reorder
> > transactions, very important from a performance point of view. Small
> > inline packets for NIC cards and accesses to GPU's frame buffer are
> > CPU-bound operations. We want to take advantages of GRE features to
> > achieve higher performance.
> >
> > Both these features are disabled for prefetchable BARs in VM because
> > memory-type MT_DEVICE_nGnRE enforced in stage-2.
> 
> Right, so Normal_NC is a red herring, and it is Device_GRE that you really are
> after, right?
> 
I think Device GRE has some practical problems. 
1. A lot of userspace code that is used to getting write-combined mappings
to GPU memory from kernel drivers calls memcpy/memset on them, and those
routines can use ldp/stp, which can fault on the Device memory type. From a
quick search I didn't find a memcpy_io or memset_io in glibc. Perhaps there
are other functions available, but a lot of userspace applications that work
on x86 and on bare-metal ARM won't work in ARM VMs without such changes.
Changing all of userspace may not always be practical, especially when
linking against binaries.

2. Sometimes, even if an application is not calling memset/memcpy directly,
gcc may insert a builtin memcpy/memset.

3. Recompiling all applications with gcc -mstrict-align has performance costs.
In our experiments it increased code size and reliably reduced performance
by 3-5%.
Also, it is not always practical to recompile all of userspace, depending on
who owns the code, linked binaries, etc.
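
To make point 1 concrete, here is a rough sketch of the kind of copy routine
an application would have to use in place of memcpy() if the mapping were a
Device memory type. The name copy_to_device() is hypothetical; it is not a
glibc or kernel API. The sketch assumes that naturally aligned,
single-register stores are acceptable to the device, and it relies on
volatile accesses to keep the compiler from substituting a memcpy call or
merging the stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): copy len bytes from ordinary
 * memory to a mapping that may be a Device memory type, using only
 * naturally aligned, single-register stores.
 */
static void copy_to_device(volatile void *dst, const void *src, size_t len)
{
    assert(len == 0 || (dst != NULL && src != NULL));

    volatile uint8_t *d8 = dst;
    const uint8_t *s = src;

    /* Byte stores until the destination is 8-byte aligned. */
    while (len > 0 && ((uintptr_t)d8 & 7) != 0) {
        *d8++ = *s++;
        len--;
    }

    /*
     * Aligned 64-bit stores. The source is normal memory, so a fixed-size
     * memcpy into a local variable is a safe way to load it.
     */
    volatile uint64_t *d64 = (volatile uint64_t *)d8;
    while (len >= 8) {
        uint64_t v;
        memcpy(&v, s, sizeof(v));
        *d64++ = v;
        s += 8;
        len -= 8;
    }

    /* Byte stores for whatever is left. */
    d8 = (volatile uint8_t *)d64;
    while (len > 0) {
        *d8++ = *s++;
        len--;
    }
}
```

Every call site that does memcpy(bar_ptr, buf, n) today would need to be
changed to something like copy_to_device(bar_ptr, buf, n), which is the kind
of userspace-wide change described above.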

From a KVM-ARM point of view, what is it about Normal NC at stage 2 for a
prefetchable BAR (however KVM gets the hint, whether from userspace or the
VMA) that is undesirable compared with Device GRE? I couldn't think of any
difference, from the device's perspective, whether the combining,
prefetching, or reordering happens because of one or the other.

> Now, I'm not convinced that we can do that directly from vfio in a device-
> agnostic manner. It is userspace that places the device in the guest's
> memory, and I have the ugly feeling that userspace needs to be in control of
> memory attributes.
> 
> Otherwise, we change the behaviour for all existing devices that have
> prefetchable BARs, and I don't think that's an acceptable move (userspace
> ABI change).
> 
>         M.
> 
> --
> Without deviation from the norm, progress is not possible.


Thread overview: 30+ messages
2021-04-29 16:29 [RFC 0/2] [RFC] Honor PCI prefetchable attributes for a virtual machine on ARM64 Shanker Donthineni
2021-04-29 16:29 ` [RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA Shanker Donthineni
2021-04-29 18:28   ` Alex Williamson
2021-04-29 19:14     ` Shanker R Donthineni
2021-04-29 19:46       ` Alex Williamson
2021-04-29 22:08         ` Vikram Sethi
2021-04-30 11:25         ` Shanker R Donthineni
     [not found]           ` <87czucngdc.wl-maz@kernel.org>
2021-04-30 13:07             ` Shanker R Donthineni
2021-04-30 14:58             ` Shanker R Donthineni
     [not found]               ` <878s4zokll.wl-maz@kernel.org>
2021-04-30 16:57                 ` Vikram Sethi [this message]
2021-05-01  9:30                   ` Marc Zyngier
2021-05-01 11:36                     ` Shanker R Donthineni
     [not found]                       ` <87czu8uowe.wl-maz@kernel.org>
2021-05-03 12:08                         ` Shanker R Donthineni
2021-05-02 17:56                     ` Vikram Sethi
2021-05-03 10:17                       ` Marc Zyngier
2021-05-03 13:35                         ` Mark Kettenis
2021-05-03 13:59                           ` Vikram Sethi
2021-05-03 14:44                             ` Alex Williamson
2021-05-03 22:03                               ` Vikram Sethi
2021-05-04  8:30                                 ` Will Deacon
2021-05-05 18:02                                   ` Catalin Marinas
2021-05-06  7:22                                     ` Christoph Hellwig
2021-05-08 16:33                                     ` Shanker R Donthineni
2021-06-02  9:37                                       ` Marc Zyngier
2021-05-04 18:03                                 ` Alex Williamson
2021-06-02  9:11                                   ` Marc Zyngier
2021-04-30  9:54   ` Lorenzo Pieralisi
2021-04-30 12:38     ` Jason Gunthorpe
2021-04-29 16:29 ` [RFC 2/2] KVM: arm64: Add write-combine support for stage-2 entries Shanker Donthineni
2021-05-03  7:01 ` [RFC 0/2] [RFC] Honor PCI prefetchable attributes for a virtual machine on ARM64 Christoph Hellwig
