From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Baolu Lu <baolu.lu@linux.intel.com>,
	Joerg Roedel <joro@8bytes.org>, Kevin Tian <kevin.tian@intel.com>,
	Ashok Raj <ashok.raj@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Will Deacon <will@kernel.org>, Liu Yi L <yi.l.liu@intel.com>,
	Jacob jun Pan <jacob.jun.pan@intel.com>,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs
Date: Tue, 31 May 2022 15:51:10 -0300
Message-ID: <20220531185110.GJ1343366@nvidia.com>
In-Reply-To: <10f16c13-c50d-892c-a20d-979b2135c953@arm.com>

On Tue, May 31, 2022 at 07:07:32PM +0100, Robin Murphy wrote:

> > And we expect the iommu driver to be unable to free page table levels
> > that have IOVA boundaries in them?
> 
> I'm not entirely sure what you mean there, but in general an unmap request
> is expected to match some previous map request 

Atomic cmpxchg is OK for inserting new page table levels, but it can't
protect you against concurrent freeing of page table levels. So
without locks, page table levels usually can't be freed, which
seems to match what the Intel driver does - at least from a cursory
look.
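
To make the cmpxchg point concrete, here is a minimal userspace sketch
using C11 atomics. It is an illustration only, not the Intel driver's
code: struct pt_level, pt_alloc() and get_or_install() are hypothetical
names. It shows why insertion is safe without a lock (the loser of the
race frees its own private copy, which nobody else has seen) and why
freeing an installed level is not (a walker may already hold the
pointer).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define PT_ENTRIES 512

/* Hypothetical page table level: each slot points at a child level. */
struct pt_level {
	_Atomic(struct pt_level *) slot[PT_ENTRIES];
};

/* Allocate an empty level with every slot initialised to NULL. */
static struct pt_level *pt_alloc(void)
{
	struct pt_level *lvl = malloc(sizeof(*lvl));

	if (!lvl)
		return NULL;
	for (int i = 0; i < PT_ENTRIES; i++)
		atomic_init(&lvl->slot[i], NULL);
	return lvl;
}

/*
 * Return the child level at idx, installing a new one if the slot is
 * empty. Concurrent callers may race; exactly one install wins.
 */
static struct pt_level *get_or_install(struct pt_level *parent, unsigned int idx)
{
	struct pt_level *cur, *fresh, *expected = NULL;

	cur = atomic_load_explicit(&parent->slot[idx], memory_order_acquire);
	if (cur)
		return cur;

	fresh = pt_alloc();
	if (!fresh)
		return NULL;

	if (atomic_compare_exchange_strong_explicit(&parent->slot[idx],
						    &expected, fresh,
						    memory_order_acq_rel,
						    memory_order_acquire))
		return fresh;

	/* Lost the race: 'fresh' was never published, so freeing it is safe. */
	free(fresh);
	return expected;

	/*
	 * Nothing here can safely free a level that HAS been published:
	 * another thread may have loaded the pointer just before the free.
	 * That is the case that needs a lock or RCU.
	 */
}
```

Calling get_or_install(root, 3) twice returns the same child both times;
the sketch deliberately has no removal path.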

This is one of the reasons the mm has the mmap/etc lock and spinlocks:
we do expect page table levels to get wiped out when VMAs are
zapped, and all the different locks provide the protection against page
tables disappearing out from under something manipulating them.

Basically every "lockless" walk in (process) MM land is actually
protected by some kind of lock that blocks zap_page_range() from
removing the page table levels themselves.

> They might either unmap the entire region originally mapped, or just
> the requested part, or might fail entirely (IIRC there was some
> nasty code in VFIO for detecting a particular behaviour).

This is something I did differently in iommufd. It always generates
unmaps that are strict supersets of the maps it issued. So it would be
a kernel bug if the driver unmaps more or less than requested.
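
As a rough illustration of that invariant (not iommufd's actual code;
struct iova_range and unmap_covers_whole_maps() are hypothetical names),
the check below accepts an unmap only if every mapping it touches lies
entirely inside it, so no mapping is ever split:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IOVA range with inclusive bounds. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Return true if 'unmap' never cuts through a mapping: each entry in
 * 'maps' is either fully contained in 'unmap' or does not overlap it.
 */
static bool unmap_covers_whole_maps(const struct iova_range *maps, size_t n,
				    struct iova_range unmap)
{
	for (size_t i = 0; i < n; i++) {
		bool overlaps = maps[i].start <= unmap.last &&
				unmap.start <= maps[i].last;
		bool contained = unmap.start <= maps[i].start &&
				 maps[i].last <= unmap.last;

		if (overlaps && !contained)
			return false;
	}
	return true;
}
```

With mappings at 0x1000-0x1fff and 0x4000-0x7fff, an unmap of
0x0-0x7fff passes, while an unmap of 0x4000-0x4fff fails because it
would split the second mapping.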

> Oh, I've spent the last couple of weeks hacking up horrible things
> manipulating entries in init_mm, and never realised that that was actually
> the special case. Oh well, live and learn.

The init_mm is sort of different; it doesn't have zap in quite the
same way, for example. I was talking about the typical process mm.

Anyhow, the right solution is to use RCU as I described before. Baolu,
do you want to try?

Jason
