From: Baolu Lu <baolu.lu@linux.intel.com>
To: Robin Murphy <robin.murphy@arm.com>, Jason Gunthorpe <jgg@nvidia.com>
Cc: baolu.lu@linux.intel.com, Joerg Roedel <joro@8bytes.org>,
	Kevin Tian <kevin.tian@intel.com>,
	Ashok Raj <ashok.raj@intel.com>,
	Christoph Hellwig <hch@infradead.org>,
	Will Deacon <will@kernel.org>, Liu Yi L <yi.l.liu@intel.com>,
	Jacob jun Pan <jacob.jun.pan@intel.com>,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in debugfs
Date: Wed, 1 Jun 2022 13:33:36 +0800
Message-ID: <c9289db7-2d5b-4d1e-ca8b-261b12b264f3@linux.intel.com>
In-Reply-To: <0b7bd793-a3c7-e7e7-8ef0-214dd5b98f05@arm.com>

Hi Robin,

Thank you for the comments.

On 2022/5/31 21:52, Robin Murphy wrote:
> On 2022-05-31 04:02, Baolu Lu wrote:
>> On 2022/5/30 20:14, Jason Gunthorpe wrote:
>>> On Sun, May 29, 2022 at 01:14:46PM +0800, Baolu Lu wrote:

[--snip--]

>> diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
>> index d927ef10641b..e6f4835b8d9f 100644
>> --- a/drivers/iommu/intel/debugfs.c
>> +++ b/drivers/iommu/intel/debugfs.c
>> @@ -333,25 +333,28 @@ static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
>>               continue;
>>
>>           path[level] = pde->val;
>> -        if (dma_pte_superpage(pde) || level == 1)
>> +        if (dma_pte_superpage(pde) || level == 1) {
>>               dump_page_info(m, start, path);
>> -        else
>> -            pgtable_walk_level(m, phys_to_virt(dma_pte_addr(pde)),
>> +        } else {
>> +            unsigned long phys_addr;
>> +
>> +            phys_addr = (unsigned long)dma_pte_addr(pde);
>> +            if (!pfn_valid(__phys_to_pfn(phys_addr)))
> 
> Given that pte_present(pde) passed just above, it was almost certainly a 
> valid entry, so it seems unlikely that the physical address it pointed 
> to could have disappeared in the meantime. If you're worried about the 
> potential case where we've been preempted during this walk for long 
> enough that the page has already been freed by an unmap, reallocated, 

Yes, this is exactly what I am worried about and what this patch is
trying to address.

> and filled with someone else's data that happens to look like valid 
> PTEs, this still isn't enough, since that data could just as well happen 
> to look like valid physical addresses too.
> I imagine that if you want to safely walk pagetables concurrently with 
> them potentially being freed, you'd probably need to get RCU involved.

I don't want to make the map/unmap paths more complex or less efficient
just for the sake of a debugfs feature. I would like debugfs to stay
orthogonal to map/unmap and to walk the page tables much like the IOMMU
hardware does: keep going as long as the physical address it accesses is
valid and accessible, and stop the traversal immediately otherwise. If we
can't achieve that, I'd rather drop support for this debugfs node.

> 
>> +                break;
>> +            pgtable_walk_level(m, phys_to_virt(phys_addr),
> 
> Also, obligatory reminder that pfn_valid() only means that pfn_to_page() 
> gets you a valid struct page. Whether that page is direct-mapped kernel 
> memory or not is a different matter.

Perhaps I can check this via the page flags?
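
For example, something along these lines (a rough, untested sketch; the
helper name is made up here, and these checks may still not guarantee
that the page is direct-mapped kernel memory):

/*
 * Illustrative only: reject PFNs that have no struct page, or whose
 * page is marked reserved, before handing the address to phys_to_virt().
 */
static bool pgtable_walk_addr_ok(u64 phys_addr)
{
	unsigned long pfn = __phys_to_pfn(phys_addr);

	if (!pfn_valid(pfn))
		return false;

	/* Reserved pages are unlikely to be page-table pages we allocated. */
	if (PageReserved(pfn_to_page(pfn)))
		return false;

	return true;
}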

> 
>>                          level - 1, start, path);
>> +        }
>>           path[level] = 0;
>>       }
>>   }
>>
>> -static int show_device_domain_translation(struct device *dev, void *data)
>> +static int __show_device_domain_translation(struct device *dev, void *data)
>>   {
>>       struct device_domain_info *info = dev_iommu_priv_get(dev);
>>       struct dmar_domain *domain = info->domain;
>>       struct seq_file *m = data;
>>       u64 path[6] = { 0 };
>>
>> -    if (!domain)
>> -        return 0;
>> -
>>       seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
>>              (u64)virt_to_phys(domain->pgd));
>>       seq_puts(m, "IOVA_PFN\t\tPML5E\t\t\tPML4E\t\t\tPDPE\t\t\tPDE\t\t\tPTE\n");
>> @@ -359,20 +362,27 @@ static int show_device_domain_translation(struct device *dev, void *data)
>>       pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
>>       seq_putc(m, '\n');
>>
>> -    return 0;
>> +    return 1;
>>   }
>>
>> -static int domain_translation_struct_show(struct seq_file *m, void *unused)
>> +static int show_device_domain_translation(struct device *dev, void *data)
>>   {
>> -    unsigned long flags;
>> -    int ret;
>> +    struct iommu_group *group;
>>
>> -    spin_lock_irqsave(&device_domain_lock, flags);
>> -    ret = bus_for_each_dev(&pci_bus_type, NULL, m,
>> -                   show_device_domain_translation);
>> -    spin_unlock_irqrestore(&device_domain_lock, flags);
>> +    group = iommu_group_get(dev);
>> +    if (group) {
>> +        iommu_group_for_each_dev(group, data,
>> +                     __show_device_domain_translation);
> 
> Why group_for_each_dev?

iommu_group_for_each_dev() holds the group mutex while the callback is
invoked. With the group mutex held, the domain cannot be changed under us.
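
In other words, the wrapper is intended to look roughly like this (a
sketch only; the rest of the hunk, including the iommu_group_put(), is
trimmed in the quote above):

static int show_device_domain_translation(struct device *dev, void *data)
{
	struct iommu_group *group;

	group = iommu_group_get(dev);
	if (group) {
		/*
		 * The group mutex is held across the callback, so the
		 * domain cannot be detached while it is being dumped.
		 */
		iommu_group_for_each_dev(group, data,
					 __show_device_domain_translation);
		iommu_group_put(group);
	}

	return 0;
}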

> If there *are* multiple devices in the group 
> then by definition they should be attached to the same domain, so 
> dumping that domain's mappings more than once seems pointless. 
> Especially given that the outer bus_for_each_dev iteration will already 
> visit each individual device anyway, so this would only make the 
> redundancy even worse than it already is.

__show_device_domain_translation() only dumps the mappings once, because
it always returns 1 and iommu_group_for_each_dev() stops iterating at the
first non-zero return value from the callback.
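
Purely as an illustration of that early-exit contract (simplified, not
the kernel's actual code; the struct and function names here are made up):

struct sketch_group_device {
	struct list_head list;
	struct device *dev;
};

static int sketch_for_each_dev(struct list_head *devices, void *data,
			       int (*fn)(struct device *, void *))
{
	struct sketch_group_device *gdev;
	int ret = 0;

	list_for_each_entry(gdev, devices, list) {
		ret = fn(gdev->dev, data);
		if (ret)	/* __show_device_domain_translation() returns 1 */
			break;
	}

	return ret;
}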

Best regards,
baolu
