linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>,
	Joao Martins <joao.m.martins@oracle.com>
Cc: Baolu Lu <baolu.lu@linux.intel.com>,
	Joerg Roedel <joro@8bytes.org>, Kevin Tian <kevin.tian@intel.com>,
	Ashok Raj <ashok.raj@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Will Deacon <will@kernel.org>, Liu Yi L <yi.l.liu@intel.com>,
	Jacob jun Pan <jacob.jun.pan@intel.com>,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs
Date: Tue, 31 May 2022 20:10:39 -0300	[thread overview]
Message-ID: <20220531231039.GO1343366@nvidia.com> (raw)
In-Reply-To: <56bbbad7-bcba-a440-692b-64e50b4eddf8@arm.com>

On Tue, May 31, 2022 at 10:22:32PM +0100, Robin Murphy wrote:

> There are only 3 instances where we'll free a table while the domain is
> live. The first is the one legitimate race condition, where two map requests
> targeting relatively nearby PTEs both go to fill in an intermediate level of
> table; whoever loses that race frees the table they allocated, but it was
> never visible to anyone else so that's definitely fine. The second is if
> we're mapping a block entry, and find that there's already a table entry
> there, wherein we assume the table must be empty, clear the entry,
> invalidate any walk caches, install the block entry, then free the orphaned
> table; since we're mapping the entire IOVA range covered by that table,
> there should be no other operations on that IOVA range attempting to walk
> the table at the same time, so it's fine. 

I saw these two in the Intel driver

> The third is effectively the inverse, if we get a block-sized unmap
> but find a table entry rather than a block at that point (on the
> assumption that it's de-facto allowed for a single unmap to cover
> multiple adjacent mappings as long as it does so exactly); similarly
> we assume that the table must be full, and no other operations
> should be racing because we're unmapping its whole IOVA range, so we
> remove the table entry, invalidate, and free as before.

Not sure I noticed this one though

This all it all makes sense though.

> Although we don't have debug dumping for io-pgtable-arm, it's good to be
> thinking about this, since it's made me realise that dirty-tracking sweeps
> per that proposal might pose a similar kind of concern, so we might still
> need to harden these corners for the sake of that.

Lets make sure Joao sees this..

It is interesting because we probably don't want the big latency
spikes that would come from using locking to block map/unmap while
dirty reading is happening - eg at the iommfd level.

From a consistency POV the only case that matters is unmap and unmap
should already be doing a dedicated dirty read directly prior to the
unmap (as per that other long thread)

So having safe racy reading in the kernel is probably best, and so RCU
would be good here too.

> that somewhere I have some half-finished patches making io-pgtable-arm use
> the iommu_iotlb_gather freelist, so maybe I'll tackle both concerns at once
> (perhaps we might even be able to RCU-ify the freelist generically? I'll see
> how it goes when I get there).

This is all very similar to how the mm's tlb gather stuff works,
especially on PPC which requires RCU protection of page tables instead
of the IPI trick.

Currently the rcu_head and the lru overlap in the struct page. To do
this I'd suggest following what was done for slab - see ca1a46d6f506
("Merge tag 'slab-for-5.17'..) and create a 'struct page' sub-class
like 'struct slab' for use with iommu page tables and put the
list_head and rcu_head sequentially in the struct iommu_pt_page.

Continue to use put_pages_list() just with the list_head in the new
struct not the lru.

If we need it for dirty tracking then it makes sense to start
progressing parts at least..

Jason

  reply	other threads:[~2022-05-31 23:11 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-27  6:30 [PATCH 00/12] iommu/vt-d: Optimize the use of locks Lu Baolu
2022-05-27  6:30 ` [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs Lu Baolu
2022-05-27 14:59   ` Jason Gunthorpe
2022-05-29  5:14     ` Baolu Lu
2022-05-30 12:14       ` Jason Gunthorpe
2022-05-31  3:02         ` Baolu Lu
2022-05-31 13:10           ` Jason Gunthorpe
2022-05-31 14:11             ` Baolu Lu
2022-05-31 14:53               ` Jason Gunthorpe
2022-05-31 15:01                 ` Robin Murphy
2022-05-31 15:13                   ` Jason Gunthorpe
2022-05-31 16:01                     ` Robin Murphy
2022-05-31 16:21                       ` Jason Gunthorpe
2022-05-31 18:07                         ` Robin Murphy
2022-05-31 18:51                           ` Jason Gunthorpe
2022-05-31 21:22                             ` Robin Murphy
2022-05-31 23:10                               ` Jason Gunthorpe [this message]
2022-06-01  8:53                                 ` Tian, Kevin
2022-06-01 12:18                                 ` Joao Martins
2022-06-01 12:33                                   ` Jason Gunthorpe
2022-06-01 13:52                                     ` Joao Martins
2022-06-01 14:22                                       ` Jason Gunthorpe
2022-06-01  6:39                             ` Baolu Lu
2022-05-31 13:52           ` Robin Murphy
2022-05-31 15:59             ` Jason Gunthorpe
2022-05-31 16:42               ` Robin Murphy
2022-06-01  5:47               ` Baolu Lu
2022-06-01  5:33             ` Baolu Lu
2022-05-27  6:30 ` [PATCH 02/12] iommu/vt-d: Remove for_each_device_domain() Lu Baolu
2022-05-27 15:00   ` Jason Gunthorpe
2022-06-01  8:53   ` Tian, Kevin
2022-05-27  6:30 ` [PATCH 03/12] iommu/vt-d: Remove clearing translation data in disable_dmar_iommu() Lu Baolu
2022-05-27 15:01   ` Jason Gunthorpe
2022-05-29  5:22     ` Baolu Lu
2022-05-27  6:30 ` [PATCH 04/12] iommu/vt-d: Use pci_get_domain_bus_and_slot() in pgtable_walk() Lu Baolu
2022-05-27 15:01   ` Jason Gunthorpe
2022-06-01  8:56   ` Tian, Kevin
2022-05-27  6:30 ` [PATCH 05/12] iommu/vt-d: Unncessary spinlock for root table alloc and free Lu Baolu
2022-06-01  9:05   ` Tian, Kevin
2022-05-27  6:30 ` [PATCH 06/12] iommu/vt-d: Acquiring lock in domain ID allocation helpers Lu Baolu
2022-06-01  9:09   ` Tian, Kevin
2022-06-01 10:38     ` Baolu Lu
2022-05-27  6:30 ` [PATCH 07/12] iommu/vt-d: Acquiring lock in pasid manipulation helpers Lu Baolu
2022-06-01  9:18   ` Tian, Kevin
2022-06-01 10:48     ` Baolu Lu
2022-05-27  6:30 ` [PATCH 08/12] iommu/vt-d: Replace spin_lock_irqsave() with spin_lock() Lu Baolu
2022-05-27  6:30 ` [PATCH 09/12] iommu/vt-d: Check device list of domain in domain free path Lu Baolu
2022-05-27 15:05   ` Jason Gunthorpe
2022-06-01  9:28   ` Tian, Kevin
2022-06-01 11:02     ` Baolu Lu
2022-06-02  6:29       ` Tian, Kevin
2022-06-06  1:34         ` Baolu Lu
2022-05-27  6:30 ` [PATCH 10/12] iommu/vt-d: Fold __dmar_remove_one_dev_info() into its caller Lu Baolu
2022-05-27  6:30 ` [PATCH 11/12] iommu/vt-d: Use device_domain_lock accurately Lu Baolu
2022-05-27  6:30 ` [PATCH 12/12] iommu/vt-d: Convert device_domain_lock into per-domain mutex Lu Baolu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220531231039.GO1343366@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=ashok.raj@intel.com \
    --cc=baolu.lu@linux.intel.com \
    --cc=hch@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jacob.jun.pan@intel.com \
    --cc=joao.m.martins@oracle.com \
    --cc=joro@8bytes.org \
    --cc=kevin.tian@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).