LKML Archive on lore.kernel.org
From: Marc Zyngier <maz@kernel.org>
To: Shanker R Donthineni <sdonthineni@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Vikram Sethi <vsethi@nvidia.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Mark Kettenis <mark.kettenis@xs4all.nl>,
	"christoffer.dall@arm.com" <christoffer.dall@arm.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"kvmarm@lists.cs.columbia.edu" <kvmarm@lists.cs.columbia.edu>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Jason Sequeira <jsequeira@nvidia.com>
Subject: Re: [RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA
Date: Wed, 02 Jun 2021 10:37:58 +0100
Message-ID: <878s3s1ua1.wl-maz@kernel.org>
In-Reply-To: <273ba1c2-dfe6-7dc1-3e40-03398e82469b@nvidia.com>

Hi Shanker,

On Sat, 08 May 2021 17:33:11 +0100,
Shanker R Donthineni <sdonthineni@nvidia.com> wrote:
> 
> Hi Marc,
> 
> On 5/5/21 1:02 PM, Catalin Marinas wrote:
> >>> Will/Catalin, perhaps you could explain your thought process on why you chose
> >>> Normal NC for ioremap_wc on the armv8 linux port instead of Device GRE or other
> >>> Device Gxx.
> >> I think a combination of: compatibility with 32-bit Arm, the need to
> >> support unaligned accesses and the potential for higher performance.
> > IIRC the _wc suffix also matches the pgprot_writecombine() used by some
> > drivers to map a video framebuffer into user space. Accesses to the
> > framebuffer are not guaranteed to be aligned (memset/memcpy don't ensure
> > alignment on arm64 and the user doesn't have a memset_io or memcpy_toio).
> >
> >> Furthermore, ioremap() already gives you a Device memory type, and we're
> >> tight on MAIR space.
> > We have MT_DEVICE_GRE currently reserved though no in-kernel user, we
> > might as well remove it.
> @Marc, could you provide your thoughts/guidance for the next step? The
> proposal of getting hints for prefetchable regions from VFIO/QEMU is not
> recommended. The only option left is to implement ARM64-dependent logic
> in KVM.
> 
> Option-1: I think we could take advantage of stage-1/2 combining rules to
> allow NORMAL_NC memory-type for device memory in VM. Always map
> device memory at stage-2 as NORMAL-NC and trust VM's stage-1 MT.
> 
> ---------------------------------------------------------------
> Stage-2 MT     Stage-1 MT    Resultant MT (combining-rules/FWB)
> ---------------------------------------------------------------
> Normal-NC      Normal-WT           Normal-NC
>    -           Normal-WB              -
>    -           Normal-NC              -
>    -           Device-<attr>       Device-<attr>
> ---------------------------------------------------------------

I think this is unwise.

Will recently debugged a pretty horrible situation when doing exactly
that: when S1 is off and S2 is on, the I-side is allowed to generate
speculative accesses (see ARMv8 ARM G.a D5.2.9 for the details). And
yes, implementations definitely do that. Add side-effect reads to the
mix, and you're in for a treat.
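
To make the failure mode concrete, here is a minimal sketch of the
combining rule, assuming no FWB and assuming that with S1 off the I-side
is treated as Normal-NC and the D-side as Device-nGnRnE (a
simplification of the architected behaviour). The enum and function
names are hypothetical and do not correspond to the kernel's MT_*
indices.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels, ordered from most to least restrictive. */
enum mem_attr {
	ATTR_DEVICE_nGnRnE,
	ATTR_DEVICE_nGnRE,
	ATTR_DEVICE_nGRE,
	ATTR_DEVICE_GRE,
	ATTR_NORMAL_NC,
	ATTR_NORMAL_WT,
	ATTR_NORMAL_WB,
};

/* Without FWB, the result is the more restrictive of S1 and S2. */
static enum mem_attr combine_s1_s2(enum mem_attr s1, enum mem_attr s2)
{
	assert(s1 <= ATTR_NORMAL_WB && s2 <= ATTR_NORMAL_WB);
	return s1 < s2 ? s1 : s2;
}

/* Normal memory permits speculative reads; this sketch treats Device as not. */
static bool is_normal(enum mem_attr a)
{
	return a >= ATTR_NORMAL_NC;
}

/* Assumption: S1 off, instruction fetches behave as Normal-NC at stage 1. */
static enum mem_attr ifetch_attr_s1_off(enum mem_attr s2)
{
	return combine_s1_s2(ATTR_NORMAL_NC, s2);
}

/* Assumption: S1 off, data accesses behave as Device-nGnRnE at stage 1. */
static enum mem_attr data_attr_s1_off(enum mem_attr s2)
{
	return combine_s1_s2(ATTR_DEVICE_nGnRnE, s2);
}
```

With S2 set to ATTR_NORMAL_NC, ifetch_attr_s1_off() yields a Normal
type, so nothing stops speculative fetches from reaching the device.
With a Device S2 mapping, the result stays Device on both sides.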

> We've been using this option internally for testing purposes and
> validated it with NVMe/Mellanox/GPU pass-through devices on the
> Marvell ThunderX2 platform.

See above. It *will* break eventually.

> Option-2: Get resource properties associated with MMIO using lookup_resource()
> and map at stage-2 as Normal-NC if IORESOURCE_PREFETCH is set in flags.

That's a pretty roundabout way of doing exactly the same thing you
initially proposed. And it suffers from the exact same problems, which
is that you change the semantics of the mapping without knowing what
the guest's intent is.

	M.

-- 
Without deviation from the norm, progress is not possible.
