linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
@ 2021-10-02 12:40 Ajay Garg
  2021-10-02 13:14 ` Lu Baolu
  0 siblings, 1 reply; 10+ messages in thread
From: Ajay Garg @ 2021-10-02 12:40 UTC (permalink / raw)
  To: linux-kernel, iommu, kvm; +Cc: Ajay Garg

Taking a SD-MMC controller (over PCI) as an example, following is an
example sequencing, where the log-flooding happened :

0.
We have a host and a guest, both running latest x86_64 kernels.

1.
Host-machine is booted up (with intel_iommu=on), and the DMA-PTEs
are setup for the controller (on the host), for the first time.

2.
The SD-controller device is added to a (L1) guest on a KVM-VM
(via virt-manager).

3.
The KVM-VM is booted up.

4.
Above triggers a re-setup of DMA-PTEs on the host, for a
second time.

It is observed that the new PTEs formed (on the host) are same
as the original PTEs, and thus following logs, accompanied by
stacktraces, overwhelm the logs :

......
 DMAR: ERROR: DMA PTE for vPFN 0x428ec already set (to 3f6ec003 not 3f6ec003)
 DMAR: ERROR: DMA PTE for vPFN 0x428ed already set (to 3f6ed003 not 3f6ed003)
 DMAR: ERROR: DMA PTE for vPFN 0x428ee already set (to 3f6ee003 not 3f6ee003)
 DMAR: ERROR: DMA PTE for vPFN 0x428ef already set (to 3f6ef003 not 3f6ef003)
 DMAR: ERROR: DMA PTE for vPFN 0x428f0 already set (to 3f6f0003 not 3f6f0003)
......

As the PTEs are same, so there is no cause of concern, and we can easily
avoid the logs-flood for this non-error case.

Signed-off-by: Ajay Garg <ajaygargnsit@gmail.com>
---
 drivers/iommu/intel/iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d75f59ae28e6..8bea8b4e3ff9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2370,7 +2370,7 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
 		 * touches the iova range
 		 */
 		tmp = cmpxchg64_local(&pte->val, 0ULL, pteval);
-		if (tmp) {
+		if (tmp && (tmp != pteval)) {
 			static int dumps = 5;
 			pr_crit("ERROR: DMA PTE for vPFN 0x%lx already set (to %llx not %llx)\n",
 				iov_pfn, tmp, (unsigned long long)pteval);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-02 12:40 [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE Ajay Garg
@ 2021-10-02 13:14 ` Lu Baolu
  2021-10-02 17:18   ` Ajay Garg
  0 siblings, 1 reply; 10+ messages in thread
From: Lu Baolu @ 2021-10-02 13:14 UTC (permalink / raw)
  To: Ajay Garg, linux-kernel, iommu, kvm; +Cc: baolu.lu, Tian, Kevin

On 2021/10/2 20:40, Ajay Garg wrote:
> Taking a SD-MMC controller (over PCI) as an example, following is an
> example sequencing, where the log-flooding happened :
> 
> 0.
> We have a host and a guest, both running latest x86_64 kernels.
> 
> 1.
> Host-machine is booted up (with intel_iommu=on), and the DMA-PTEs
> are setup for the controller (on the host), for the first time.
> 
> 2.
> The SD-controller device is added to a (L1) guest on a KVM-VM
> (via virt-manager).

Isn't the domain should be switched from a default domain to an
unmanaged domain when the device is assigned to the guest?

Even you want to r-setup the same mappings, you need to un-map all
existing mappings, right?

Best regards,
baolu

> 
> 3.
> The KVM-VM is booted up.
> 
> 4.
> Above triggers a re-setup of DMA-PTEs on the host, for a
> second time.
> 
> It is observed that the new PTEs formed (on the host) are same
> as the original PTEs, and thus following logs, accompanied by
> stacktraces, overwhelm the logs :
> 
> ......
>   DMAR: ERROR: DMA PTE for vPFN 0x428ec already set (to 3f6ec003 not 3f6ec003)
>   DMAR: ERROR: DMA PTE for vPFN 0x428ed already set (to 3f6ed003 not 3f6ed003)
>   DMAR: ERROR: DMA PTE for vPFN 0x428ee already set (to 3f6ee003 not 3f6ee003)
>   DMAR: ERROR: DMA PTE for vPFN 0x428ef already set (to 3f6ef003 not 3f6ef003)
>   DMAR: ERROR: DMA PTE for vPFN 0x428f0 already set (to 3f6f0003 not 3f6f0003)
> ......
> 
> As the PTEs are same, so there is no cause of concern, and we can easily
> avoid the logs-flood for this non-error case.
> 
> Signed-off-by: Ajay Garg <ajaygargnsit@gmail.com>
> ---
>   drivers/iommu/intel/iommu.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index d75f59ae28e6..8bea8b4e3ff9 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -2370,7 +2370,7 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
>   		 * touches the iova range
>   		 */
>   		tmp = cmpxchg64_local(&pte->val, 0ULL, pteval);
> -		if (tmp) {
> +		if (tmp && (tmp != pteval)) {
>   			static int dumps = 5;
>   			pr_crit("ERROR: DMA PTE for vPFN 0x%lx already set (to %llx not %llx)\n",
>   				iov_pfn, tmp, (unsigned long long)pteval);
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-02 13:14 ` Lu Baolu
@ 2021-10-02 17:18   ` Ajay Garg
  2021-10-04 22:31     ` Alex Williamson
  0 siblings, 1 reply; 10+ messages in thread
From: Ajay Garg @ 2021-10-02 17:18 UTC (permalink / raw)
  To: Lu Baolu; +Cc: linux-kernel, iommu, kvm, Tian, Kevin

Thanks Lu for the reply.

>
> Isn't the domain should be switched from a default domain to an
> unmanaged domain when the device is assigned to the guest?
>
> Even you want to r-setup the same mappings, you need to un-map all
> existing mappings, right?
>

Hmm, I guess that's a (design) decision the KVM/QEMU/VFIO communities
need to take.
May be the patch could suppress the flooding till then?



Thanks and Regards,
Ajay

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-02 17:18   ` Ajay Garg
@ 2021-10-04 22:31     ` Alex Williamson
  2021-10-07  9:03       ` Ajay Garg
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Williamson @ 2021-10-04 22:31 UTC (permalink / raw)
  To: Ajay Garg; +Cc: Lu Baolu, Tian, Kevin, iommu, linux-kernel, kvm

On Sat, 2 Oct 2021 22:48:24 +0530
Ajay Garg <ajaygargnsit@gmail.com> wrote:

> Thanks Lu for the reply.
> 
> >
> > Isn't the domain should be switched from a default domain to an
> > unmanaged domain when the device is assigned to the guest?
> >
> > Even you want to r-setup the same mappings, you need to un-map all
> > existing mappings, right?
> >  
> 
> Hmm, I guess that's a (design) decision the KVM/QEMU/VFIO communities
> need to take.
> May be the patch could suppress the flooding till then?

No, this is wrong.  The pte values should not exist, it doesn't matter
that they're the same.  Is the host driver failing to remove mappings
and somehow they persist in the new vfio owned domain?  There's
definitely a bug beyond logging going on here.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-04 22:31     ` Alex Williamson
@ 2021-10-07  9:03       ` Ajay Garg
  2021-10-10  6:15         ` Ajay Garg
  0 siblings, 1 reply; 10+ messages in thread
From: Ajay Garg @ 2021-10-07  9:03 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Lu Baolu, Tian, Kevin, iommu, linux-kernel, kvm

Thanks Alex for the reply.


Lu, Alex :

I got my diagnosis regarding the host-driver wrong, my apologies.
There is no issue with the pci-device's host-driver (confirmed by
preventing the loading of host-driver at host-bootup). Thus, nothing
to be fixed at the host-driver side.

Rather seems some dma mapping/unmapping inconsistency is happening,
when kvm/qemu boots up with the pci-device attached to the guest.

I put up debug-logs in "vfio_iommu_type1_ioctl" method in
"vfio_iommu_type1.c" (on the host-machine).
When the guest boots up, repeated DMA-mappings are observed for the
same address as per the host-machine's logs (without a corresponding
DMA-unmapping first) :

##########################################################################################
ajay@ajay-Latitude-E6320:~$ tail -f /var/log/syslog | grep "ajay: "
Oct  7 14:12:32 ajay-Latitude-E6320 kernel: [  146.202297] ajay:
_MAP_DMA for [0x7ffe724a8670] status [0]
Oct  7 14:12:32 ajay-Latitude-E6320 kernel: [  146.583179] ajay:
_MAP_DMA for [0x7ffe724a8670] status [0]
Oct  7 14:12:32 ajay-Latitude-E6320 kernel: [  146.583253] ajay:
_MAP_DMA for [0x7ffe724a8670] status [0]
Oct  7 14:12:36 ajay-Latitude-E6320 kernel: [  150.105584] ajay:
_MAP_DMA for [0x7ffe724a8670] status [0]
Oct  7 14:13:07 ajay-Latitude-E6320 kernel: [  180.986499] ajay:
_UNMAP_DMA for [0x7ffe724a9840] status [0]
Oct  7 14:13:07 ajay-Latitude-E6320 kernel: [  180.986559] ajay:
_MAP_DMA for [0x7ffe724a97d0] status [0]
Oct  7 14:13:07 ajay-Latitude-E6320 kernel: [  180.986638] ajay:
_MAP_DMA for [0x7ffe724a97d0] status [0]
Oct  7 14:13:07 ajay-Latitude-E6320 kernel: [  181.087359] ajay:
_MAP_DMA for [0x7ffe724a97d0] status [0]
Oct  7 14:13:13 ajay-Latitude-E6320 kernel: [  187.271232] ajay:
_UNMAP_DMA for [0x7fde7b7fcfa0] status [0]
Oct  7 14:13:13 ajay-Latitude-E6320 kernel: [  187.271320] ajay:
_UNMAP_DMA for [0x7fde7b7fcfa0] status [0]
....
##########################################################################################


I'll try and backtrack to the userspace process that is sending these ioctls.


Thanks and Regards,
Ajay






On Tue, Oct 5, 2021 at 4:01 AM Alex Williamson
<alex.williamson@redhat.com> wrote:
>
> On Sat, 2 Oct 2021 22:48:24 +0530
> Ajay Garg <ajaygargnsit@gmail.com> wrote:
>
> > Thanks Lu for the reply.
> >
> > >
> > > Isn't the domain should be switched from a default domain to an
> > > unmanaged domain when the device is assigned to the guest?
> > >
> > > Even you want to r-setup the same mappings, you need to un-map all
> > > existing mappings, right?
> > >
> >
> > Hmm, I guess that's a (design) decision the KVM/QEMU/VFIO communities
> > need to take.
> > May be the patch could suppress the flooding till then?
>
> No, this is wrong.  The pte values should not exist, it doesn't matter
> that they're the same.  Is the host driver failing to remove mappings
> and somehow they persist in the new vfio owned domain?  There's
> definitely a bug beyond logging going on here.  Thanks,
>
> Alex
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-07  9:03       ` Ajay Garg
@ 2021-10-10  6:15         ` Ajay Garg
  2021-10-11  6:19           ` Ajay Garg
  0 siblings, 1 reply; 10+ messages in thread
From: Ajay Garg @ 2021-10-10  6:15 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Lu Baolu, Tian, Kevin, iommu, linux-kernel, kvm

> I'll try and backtrack to the userspace process that is sending these ioctls.
>

The userspace process is qemu.

I compiled qemu from latest source, installed via "sudo make install"
on host-machine, rebooted the host-machine, and booted up the
guest-machine on the host-machine. Now, no kernel-flooding is seen on
the host-machine.

For me, the issue is thus closed-invalid; admins may take the
necessary action to officially mark ;)


Thanks and Regards,
Ajay

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-10  6:15         ` Ajay Garg
@ 2021-10-11  6:19           ` Ajay Garg
  2021-10-11 14:52             ` Alex Williamson
  0 siblings, 1 reply; 10+ messages in thread
From: Ajay Garg @ 2021-10-11  6:19 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Lu Baolu, Tian, Kevin, iommu, linux-kernel, kvm

The flooding was seen today again, after I booted the host-machine in
the morning.
Need to look what the heck is going on ...

On Sun, Oct 10, 2021 at 11:45 AM Ajay Garg <ajaygargnsit@gmail.com> wrote:
>
> > I'll try and backtrack to the userspace process that is sending these ioctls.
> >
>
> The userspace process is qemu.
>
> I compiled qemu from latest source, installed via "sudo make install"
> on host-machine, rebooted the host-machine, and booted up the
> guest-machine on the host-machine. Now, no kernel-flooding is seen on
> the host-machine.
>
> For me, the issue is thus closed-invalid; admins may take the
> necessary action to officially mark ;)
>
>
> Thanks and Regards,
> Ajay

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-11  6:19           ` Ajay Garg
@ 2021-10-11 14:52             ` Alex Williamson
  2021-10-11 18:13               ` Ajay Garg
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Williamson @ 2021-10-11 14:52 UTC (permalink / raw)
  To: Ajay Garg; +Cc: Lu Baolu, Tian, Kevin, iommu, linux-kernel, kvm

On Mon, 11 Oct 2021 11:49:33 +0530
Ajay Garg <ajaygargnsit@gmail.com> wrote:

> The flooding was seen today again, after I booted the host-machine in
> the morning.
> Need to look what the heck is going on ...
> 
> On Sun, Oct 10, 2021 at 11:45 AM Ajay Garg <ajaygargnsit@gmail.com> wrote:
> >  
> > > I'll try and backtrack to the userspace process that is sending these ioctls.
> > >  
> >
> > The userspace process is qemu.
> >
> > I compiled qemu from latest source, installed via "sudo make install"
> > on host-machine, rebooted the host-machine, and booted up the
> > guest-machine on the host-machine. Now, no kernel-flooding is seen on
> > the host-machine.
> >
> > For me, the issue is thus closed-invalid; admins may take the
> > necessary action to officially mark ;)

Even this QEMU explanation doesn't make a lot of sense, vfio tracks
userspace mappings and will return an -EEXIST error for duplicate or
overlapping IOVA entries.  We expect to have an entirely empty IOMMU
domain when a device is assigned, but it seems the only way userspace
can trigger duplicate PTEs would be if mappings already exist, or we
have a bug somewhere.

If the most recent instance is purely on bare metal, then it seems the
host itself has conflicting mappings.  I can only speculate with the
limited data presented, but I'm suspicious there's something happening
with RMRRs here (but that should also entirely preclude assignment).
dmesg, lspci -vvv, and VM configuration would be useful.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-11 14:52             ` Alex Williamson
@ 2021-10-11 18:13               ` Ajay Garg
  2021-10-12 14:02                 ` Ajay Garg
  0 siblings, 1 reply; 10+ messages in thread
From: Ajay Garg @ 2021-10-11 18:13 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Lu Baolu, Tian, Kevin, iommu, linux-kernel, kvm

Thanks Alex for your time.

I think I may have found the issue. Right now, when doing a
dma-unmapping, we do a "soft-unmapping" only, as the pte-values
themselves are not cleared in the unlinked pagetable-frame.

I have made the (simple) changes, and things are looking good as of
now (almost an hour now).
However, this time I will give it a day ;)

If there is not a single-flooding observed in the next 24 hours, I
would float the v2 patch for review.


Thanks again for your time and patience.


Thanks and Regards,
Ajay


>
> Even this QEMU explanation doesn't make a lot of sense, vfio tracks
> userspace mappings and will return an -EEXIST error for duplicate or
> overlapping IOVA entries.  We expect to have an entirely empty IOMMU
> domain when a device is assigned, but it seems the only way userspace
> can trigger duplicate PTEs would be if mappings already exist, or we
> have a bug somewhere.
>
> If the most recent instance is purely on bare metal, then it seems the
> host itself has conflicting mappings.  I can only speculate with the
> limited data presented, but I'm suspicious there's something happening
> with RMRRs here (but that should also entirely preclude assignment).
> dmesg, lspci -vvv, and VM configuration would be useful.  Thanks,
>
> Alex
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE.
  2021-10-11 18:13               ` Ajay Garg
@ 2021-10-12 14:02                 ` Ajay Garg
  0 siblings, 0 replies; 10+ messages in thread
From: Ajay Garg @ 2021-10-12 14:02 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Lu Baolu, Tian, Kevin, iommu, linux-kernel, kvm

Hi Alex, Lu.

Posted v2 patch, as per
https://lists.linuxfoundation.org/pipermail/iommu/2021-October/059955.html


Kindly review, and let's continue on that thread now.


Thanks and Regards,
Ajay

On Mon, Oct 11, 2021 at 11:43 PM Ajay Garg <ajaygargnsit@gmail.com> wrote:
>
> Thanks Alex for your time.
>
> I think I may have found the issue. Right now, when doing a
> dma-unmapping, we do a "soft-unmapping" only, as the pte-values
> themselves are not cleared in the unlinked pagetable-frame.
>
> I have made the (simple) changes, and things are looking good as of
> now (almost an hour now).
> However, this time I will give it a day ;)
>
> If there is not a single-flooding observed in the next 24 hours, I
> would float the v2 patch for review.
>
>
> Thanks again for your time and patience.
>
>
> Thanks and Regards,
> Ajay
>
>
> >
> > Even this QEMU explanation doesn't make a lot of sense, vfio tracks
> > userspace mappings and will return an -EEXIST error for duplicate or
> > overlapping IOVA entries.  We expect to have an entirely empty IOMMU
> > domain when a device is assigned, but it seems the only way userspace
> > can trigger duplicate PTEs would be if mappings already exist, or we
> > have a bug somewhere.
> >
> > If the most recent instance is purely on bare metal, then it seems the
> > host itself has conflicting mappings.  I can only speculate with the
> > limited data presented, but I'm suspicious there's something happening
> > with RMRRs here (but that should also entirely preclude assignment).
> > dmesg, lspci -vvv, and VM configuration would be useful.  Thanks,
> >
> > Alex
> >

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-10-12 14:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-02 12:40 [PATCH] iommu: intel: remove flooding of non-error logs, when new-DMA-PTE is the same as old-DMA-PTE Ajay Garg
2021-10-02 13:14 ` Lu Baolu
2021-10-02 17:18   ` Ajay Garg
2021-10-04 22:31     ` Alex Williamson
2021-10-07  9:03       ` Ajay Garg
2021-10-10  6:15         ` Ajay Garg
2021-10-11  6:19           ` Ajay Garg
2021-10-11 14:52             ` Alex Williamson
2021-10-11 18:13               ` Ajay Garg
2021-10-12 14:02                 ` Ajay Garg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).