All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lorenzo Pieralisi <lpieralisi@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: ankita@nvidia.com, maz@kernel.org, oliver.upton@linux.dev,
	catalin.marinas@arm.com, will@kernel.org, aniketa@nvidia.com,
	cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
	vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
	jhubbard@nvidia.com, danw@nvidia.com,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
Date: Wed, 13 Sep 2023 17:26:01 +0200	[thread overview]
Message-ID: <ZQHUifAfJ+lZikAn@lpieralisi> (raw)
In-Reply-To: <ZP9MQdRYmlawNsbC@nvidia.com>

On Mon, Sep 11, 2023 at 02:20:01PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 04:57:51PM +0200, Lorenzo Pieralisi wrote:
> > On Thu, Sep 07, 2023 at 11:14:59AM -0700, ankita@nvidia.com wrote:
> > > From: Ankit Agrawal <ankita@nvidia.com>
> > > 
> > > Linux allows device drivers to map IO memory on a per-page basis using
> > > "write combining" or WC. This is often done using
> > > pgprot_writecombing(). The driver knows which pages can support WC
> > 
> > pgprot_writecombine() ?
> > 
> > > access and the proper programming model to generate this IO. Generally
> > > the use case is to boost performance by using write combining to
> > > generate larger PCIe MemWr TLPs.
> > 
> > First off, this changeset does not affect *only* Linux guests, obviously.
> 
> I think everyone understands that. It can be clarified.

OK, this is not a Linux guest fix *only*, everyone understands that
but I would rather make sure that's crystal clear.

> > I understand that's the use case you are after but this change is
> > targeting all VMs, it must be clear.
> > 
> > Then WC and mapping to PCI TLPs, either you describe that in details
> > (NormalNC vs device-nGnRE and resulting SystemBus<->PCI transactions) or
> > you don't describe it at all, as it stands I don't know how to use
> > this information.
> 
> How about another pargraph:
> 
>  KVM prevents all VMs (including Linux) from accessing NORMAL_NC
>  mappings, which is how Linux implements pgprot_writecombine(). This
>  prevents using this performance optimization within VMs.
> 
> I don't think we need to go into details how it works beyond that it
> requires NORMAL_NC.

I think it is worth summarizing the discussion that led to this change,
I can write something up.

> > > Allow VMs to select DEVICE_* or NORMAL_NC on a page by page basis for
> > > all IO memory. This puts the VM in charge of the memory attributes,
> > > and removes the KVM override to DEVICE_nGnRE.
> > > 
> > > Ultimately this makes pgprot_writecombing() work correctly in VMs and
> > 
> > pgprot_writecombine() ?
> > 
> > > allows drivers like mlx5 to fully operate their HW.
> > > 
> > > After some discussions with ARM and CPU architects we reached the
> > > conclusion there was no need for KVM to prevent the VM from selecting
> > > between DEVICE_* and NORMAL_NC for IO memory in VMs. There was a fear
> > > that NORMAL_NC could result in uncontained failures, but upon deeper
> > > analysis it turns out there are already possible cases for uncontained
> > > failures with DEVICE types too. Ultimately the platform must be
> > > implemented in a way that ensures that all DEVICE_* and NORMAL_NC
> > > accesses have no uncontained failures.
> > 
> > I would reorder/rephrase this changelog as follows:
> > 
> > - Describe what the problem is (ie KVM default s2 mappings)
> 
> The problem is that pgprot_writecombine() doesn't work in Linux
> VMs. That is the first pagraph.

s/pagraph/paragraph

Well to me that's how the problem was spotted but OK, I can rewrite
the log and post it here, NP.

> > - Describe how you are solving it
> 
> That is the middle paragraph "Allow VMs to select DEVICE_* or
> NORMAL_NC on a page by page basis"
> 
> > - Add a link to the documentation that states why it is safe to do
> >   that and the resulting s1/s2 mappings combination
> 
> AFAIK there is no documentation beyond the combining rules. Exactly
> what should happen in various error conditions is implementation
> defined. Catalin did you ever find anything?
> 
> > It must be clear why from a legacy standpoint this is a safe change
> > to apply.
> 
> This is why:
>  
> > > Fortunately real platforms do tend to implement this.

Is it possible please to describe what "this" is in details ?

What does "real platforms" mean ?

Let's describe what HW/SW should be implemented to make this
safe.

> It is why it is safe today, because real platforms don't throw

"real platforms", I have a problem with that, see above.

> uncontained errors from typical PCI accesses that VFIO allows. I think
> the conclusions was it turns out that is just because they don't do
> errors at all, not because DEVICE_* prevents it.
> 
> > Remove this sentence, it adds no information for someone who
> > is chasing bugs or just wants to understand the change itself.
> 
> So, if you hit a bug here you might evaluate if there is something
> wrong with your platform, ie it is allowing uncontained errors in
> unexpected places.

Developers looking at commit log in the future must be able to
understand why it was a sound change to make. As it stands IMO
the commit log does not explain it fully.

I can write up the commit log and post it if I manage to summarize
it any better - more important the review on the code (that was already
provided), I will try to write something up asap.

Thanks,
Lorenzo

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Lorenzo Pieralisi <lpieralisi@kernel.org>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: ankita@nvidia.com, maz@kernel.org, oliver.upton@linux.dev,
	catalin.marinas@arm.com, will@kernel.org, aniketa@nvidia.com,
	cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
	vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
	jhubbard@nvidia.com, danw@nvidia.com,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
Date: Wed, 13 Sep 2023 17:26:01 +0200	[thread overview]
Message-ID: <ZQHUifAfJ+lZikAn@lpieralisi> (raw)
In-Reply-To: <ZP9MQdRYmlawNsbC@nvidia.com>

On Mon, Sep 11, 2023 at 02:20:01PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 11, 2023 at 04:57:51PM +0200, Lorenzo Pieralisi wrote:
> > On Thu, Sep 07, 2023 at 11:14:59AM -0700, ankita@nvidia.com wrote:
> > > From: Ankit Agrawal <ankita@nvidia.com>
> > > 
> > > Linux allows device drivers to map IO memory on a per-page basis using
> > > "write combining" or WC. This is often done using
> > > pgprot_writecombing(). The driver knows which pages can support WC
> > 
> > pgprot_writecombine() ?
> > 
> > > access and the proper programming model to generate this IO. Generally
> > > the use case is to boost performance by using write combining to
> > > generate larger PCIe MemWr TLPs.
> > 
> > First off, this changeset does not affect *only* Linux guests, obviously.
> 
> I think everyone understands that. It can be clarified.

OK, this is not a Linux guest fix *only*, everyone understands that
but I would rather make sure that's crystal clear.

> > I understand that's the use case you are after but this change is
> > targeting all VMs, it must be clear.
> > 
> > Then WC and mapping to PCI TLPs, either you describe that in details
> > (NormalNC vs device-nGnRE and resulting SystemBus<->PCI transactions) or
> > you don't describe it at all, as it stands I don't know how to use
> > this information.
> 
> How about another pargraph:
> 
>  KVM prevents all VMs (including Linux) from accessing NORMAL_NC
>  mappings, which is how Linux implements pgprot_writecombine(). This
>  prevents using this performance optimization within VMs.
> 
> I don't think we need to go into details how it works beyond that it
> requires NORMAL_NC.

I think it is worth summarizing the discussion that led to this change,
I can write something up.

> > > Allow VMs to select DEVICE_* or NORMAL_NC on a page by page basis for
> > > all IO memory. This puts the VM in charge of the memory attributes,
> > > and removes the KVM override to DEVICE_nGnRE.
> > > 
> > > Ultimately this makes pgprot_writecombing() work correctly in VMs and
> > 
> > pgprot_writecombine() ?
> > 
> > > allows drivers like mlx5 to fully operate their HW.
> > > 
> > > After some discussions with ARM and CPU architects we reached the
> > > conclusion there was no need for KVM to prevent the VM from selecting
> > > between DEVICE_* and NORMAL_NC for IO memory in VMs. There was a fear
> > > that NORMAL_NC could result in uncontained failures, but upon deeper
> > > analysis it turns out there are already possible cases for uncontained
> > > failures with DEVICE types too. Ultimately the platform must be
> > > implemented in a way that ensures that all DEVICE_* and NORMAL_NC
> > > accesses have no uncontained failures.
> > 
> > I would reorder/rephrase this changelog as follows:
> > 
> > - Describe what the problem is (ie KVM default s2 mappings)
> 
> The problem is that pgprot_writecombine() doesn't work in Linux
> VMs. That is the first pagraph.

s/pagraph/paragraph

Well to me that's how the problem was spotted but OK, I can rewrite
the log and post it here, NP.

> > - Describe how you are solving it
> 
> That is the middle paragraph "Allow VMs to select DEVICE_* or
> NORMAL_NC on a page by page basis"
> 
> > - Add a link to the documentation that states why it is safe to do
> >   that and the resulting s1/s2 mappings combination
> 
> AFAIK there is no documentation beyond the combining rules. Exactly
> what should happen in various error conditions is implementation
> defined. Catalin did you ever find anything?
> 
> > It must be clear why from a legacy standpoint this is a safe change
> > to apply.
> 
> This is why:
>  
> > > Fortunately real platforms do tend to implement this.

Is it possible please to describe what "this" is in details ?

What does "real platforms" mean ?

Let's describe what HW/SW should be implemented to make this
safe.

> It is why it is safe today, because real platforms don't throw

"real platforms", I have a problem with that, see above.

> uncontained errors from typical PCI accesses that VFIO allows. I think
> the conclusions was it turns out that is just because they don't do
> errors at all, not because DEVICE_* prevents it.
> 
> > Remove this sentence, it adds no information for someone who
> > is chasing bugs or just wants to understand the change itself.
> 
> So, if you hit a bug here you might evaluate if there is something
> wrong with your platform, ie it is allowing uncontained errors in
> unexpected places.

Developers looking at commit log in the future must be able to
understand why it was a sound change to make. As it stands IMO
the commit log does not explain it fully.

I can write up the commit log and post it if I manage to summarize
it any better - more important the review on the code (that was already
provided), I will try to write something up asap.

Thanks,
Lorenzo

  reply	other threads:[~2023-09-13 15:26 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-07 18:14 [PATCH v1 0/2] KVM: arm64: support write combining and cachable IO memory in VMs ankita
2023-09-07 18:14 ` ankita
2023-09-07 18:14 ` [PATCH v1 1/2] KVM: arm64: determine memory type from VMA ankita
2023-09-07 18:14   ` ankita
2023-09-07 19:12   ` Jason Gunthorpe
2023-09-07 19:12     ` Jason Gunthorpe
2023-10-05 16:15   ` Catalin Marinas
2023-10-05 16:15     ` Catalin Marinas
2023-10-05 16:54     ` Jason Gunthorpe
2023-10-05 16:54       ` Jason Gunthorpe
2023-10-10 14:25       ` Catalin Marinas
2023-10-10 14:25         ` Catalin Marinas
2023-10-10 15:05         ` Jason Gunthorpe
2023-10-10 15:05           ` Jason Gunthorpe
2023-10-10 17:19           ` Catalin Marinas
2023-10-10 17:19             ` Catalin Marinas
2023-10-10 18:23             ` Jason Gunthorpe
2023-10-10 18:23               ` Jason Gunthorpe
2023-10-11 17:45               ` Catalin Marinas
2023-10-11 17:45                 ` Catalin Marinas
2023-10-11 18:38                 ` Jason Gunthorpe
2023-10-11 18:38                   ` Jason Gunthorpe
2023-10-12 16:16                   ` Catalin Marinas
2023-10-12 16:16                     ` Catalin Marinas
2024-03-10  3:49                     ` Ankit Agrawal
2024-03-10  3:49                       ` Ankit Agrawal
2024-03-19 13:38                       ` Jason Gunthorpe
2024-03-19 13:38                         ` Jason Gunthorpe
2023-10-23 13:20   ` Shameerali Kolothum Thodi
2023-10-23 13:20     ` Shameerali Kolothum Thodi
2023-09-07 18:14 ` [PATCH v1 2/2] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory ankita
2023-09-07 18:14   ` ankita
2023-09-08 16:40   ` Catalin Marinas
2023-09-08 16:40     ` Catalin Marinas
2023-09-11 14:57   ` Lorenzo Pieralisi
2023-09-11 14:57     ` Lorenzo Pieralisi
2023-09-11 17:20     ` Jason Gunthorpe
2023-09-11 17:20       ` Jason Gunthorpe
2023-09-13 15:26       ` Lorenzo Pieralisi [this message]
2023-09-13 15:26         ` Lorenzo Pieralisi
2023-09-13 18:54         ` Jason Gunthorpe
2023-09-13 18:54           ` Jason Gunthorpe
2023-09-26  8:31           ` Lorenzo Pieralisi
2023-09-26  8:31             ` Lorenzo Pieralisi
2023-09-26 12:25             ` Jason Gunthorpe
2023-09-26 12:25               ` Jason Gunthorpe
2023-09-26 13:52             ` Catalin Marinas
2023-09-26 13:52               ` Catalin Marinas
2023-09-26 16:12               ` Lorenzo Pieralisi
2023-09-26 16:12                 ` Lorenzo Pieralisi
2023-10-05  9:56               ` Lorenzo Pieralisi
2023-10-05  9:56                 ` Lorenzo Pieralisi
2023-10-05 11:56                 ` Jason Gunthorpe
2023-10-05 11:56                   ` Jason Gunthorpe
2023-10-05 14:08                   ` Lorenzo Pieralisi
2023-10-05 14:08                     ` Lorenzo Pieralisi
2023-10-12 12:35                 ` Will Deacon
2023-10-12 12:35                   ` Will Deacon
2023-10-12 13:20                   ` Jason Gunthorpe
2023-10-12 13:20                     ` Jason Gunthorpe
2023-10-12 14:29                     ` Lorenzo Pieralisi
2023-10-12 14:29                       ` Lorenzo Pieralisi
2023-10-12 13:53                   ` Catalin Marinas
2023-10-12 13:53                     ` Catalin Marinas
2023-10-12 14:48                     ` Will Deacon
2023-10-12 14:48                       ` Will Deacon
2023-10-12 15:44                       ` Jason Gunthorpe
2023-10-12 15:44                         ` Jason Gunthorpe
2023-10-12 16:39                         ` Will Deacon
2023-10-12 16:39                           ` Will Deacon
2023-10-12 18:36                           ` Jason Gunthorpe
2023-10-12 18:36                             ` Jason Gunthorpe
2023-10-13  9:29                             ` Will Deacon
2023-10-13  9:29                               ` Will Deacon
2023-10-12 17:26                       ` Catalin Marinas
2023-10-12 17:26                         ` Catalin Marinas
2023-10-13  9:29                         ` Will Deacon
2023-10-13  9:29                           ` Will Deacon
2023-10-13 13:08                           ` Catalin Marinas
2023-10-13 13:08                             ` Catalin Marinas
2023-10-13 13:45                             ` Jason Gunthorpe
2023-10-13 13:45                               ` Jason Gunthorpe
2023-10-19 11:07                               ` Catalin Marinas
2023-10-19 11:07                                 ` Catalin Marinas
2023-10-19 11:51                                 ` Jason Gunthorpe
2023-10-19 11:51                                   ` Jason Gunthorpe
2023-10-20 11:21                                   ` Catalin Marinas
2023-10-20 11:21                                     ` Catalin Marinas
2023-10-20 11:47                                     ` Jason Gunthorpe
2023-10-20 11:47                                       ` Jason Gunthorpe
2023-10-20 14:03                                       ` Lorenzo Pieralisi
2023-10-20 14:03                                         ` Lorenzo Pieralisi
2023-10-20 14:28                                         ` Jason Gunthorpe
2023-10-20 14:28                                           ` Jason Gunthorpe
2023-10-19 13:35                                 ` Lorenzo Pieralisi
2023-10-19 13:35                                   ` Lorenzo Pieralisi
2023-10-13 15:28                             ` Lorenzo Pieralisi
2023-10-13 15:28                               ` Lorenzo Pieralisi
2023-10-19 11:12                               ` Catalin Marinas
2023-10-19 11:12                                 ` Catalin Marinas
2023-11-09 15:34                             ` Lorenzo Pieralisi
2023-11-09 15:34                               ` Lorenzo Pieralisi
2023-11-10 14:26                               ` Jason Gunthorpe
2023-11-10 14:26                                 ` Jason Gunthorpe
2023-11-13  0:42                                 ` Lorenzo Pieralisi
2023-11-13  0:42                                   ` Lorenzo Pieralisi
2023-11-13 17:41                               ` Catalin Marinas
2023-11-13 17:41                                 ` Catalin Marinas
2023-10-12 12:27   ` Will Deacon
2023-10-12 12:27     ` Will Deacon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZQHUifAfJ+lZikAn@lpieralisi \
    --to=lpieralisi@kernel.org \
    --cc=acurrid@nvidia.com \
    --cc=aniketa@nvidia.com \
    --cc=ankita@nvidia.com \
    --cc=apopple@nvidia.com \
    --cc=catalin.marinas@arm.com \
    --cc=cjia@nvidia.com \
    --cc=danw@nvidia.com \
    --cc=jgg@nvidia.com \
    --cc=jhubbard@nvidia.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=kwankhede@nvidia.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=targupta@nvidia.com \
    --cc=vsethi@nvidia.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.