xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
@ 2019-06-03 13:00 Jan Beulich
  2019-06-03 13:00 ` [Xen-devel] " Jan Beulich
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Jan Beulich @ 2019-06-03 13:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
was redundant with amd_iommu_detect_one_acpi() already calling
pci_ro_device().

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1021,8 +1021,6 @@ static void * __init allocate_ppr_log(st
 
 static int __init amd_iommu_init_one(struct amd_iommu *iommu)
 {
-    pci_hide_device(iommu->seg, PCI_BUS(iommu->bdf), PCI_DEVFN2(iommu->bdf));
-
     if ( map_iommu_mmio_region(iommu) != 0 )
         goto error_out;
 



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-03 13:00 [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen" Jan Beulich
@ 2019-06-03 13:00 ` Jan Beulich
  2019-06-04  8:48 ` Roger Pau Monné
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-06-03 13:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
was redundant with amd_iommu_detect_one_acpi() already calling
pci_ro_device().

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1021,8 +1021,6 @@ static void * __init allocate_ppr_log(st
 
 static int __init amd_iommu_init_one(struct amd_iommu *iommu)
 {
-    pci_hide_device(iommu->seg, PCI_BUS(iommu->bdf), PCI_DEVFN2(iommu->bdf));
-
     if ( map_iommu_mmio_region(iommu) != 0 )
         goto error_out;
 



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-03 13:00 [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen" Jan Beulich
  2019-06-03 13:00 ` [Xen-devel] " Jan Beulich
@ 2019-06-04  8:48 ` Roger Pau Monné
  2019-06-04  8:48   ` [Xen-devel] " Roger Pau Monné
                     ` (2 more replies)
  2019-06-04  9:04 ` Andrew Cooper
  2019-06-19 16:29 ` Woods, Brian
  3 siblings, 3 replies; 20+ messages in thread
From: Roger Pau Monné @ 2019-06-04  8:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
> was redundant with amd_iommu_detect_one_acpi() already calling
> pci_ro_device().
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I think this needs to be squashed together with your `AMD/IOMMU: don't
"add" IOMMUs` patch, or else PVH dom0 will break because
update_paging_mode will find devices not behind an IOMMU assigned to
dom0, thus returning an error and crashing dom0.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04  8:48 ` Roger Pau Monné
@ 2019-06-04  8:48   ` Roger Pau Monné
  2019-06-04  9:29   ` Jan Beulich
  2019-06-04 13:02   ` Jan Beulich
  2 siblings, 0 replies; 20+ messages in thread
From: Roger Pau Monné @ 2019-06-04  8:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
> was redundant with amd_iommu_detect_one_acpi() already calling
> pci_ro_device().
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I think this needs to be squashed together with your `AMD/IOMMU: don't
"add" IOMMUs` patch, or else PVH dom0 will break because
update_paging_mode will find devices not behind an IOMMU assigned to
dom0, thus returning an error and crashing dom0.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-03 13:00 [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen" Jan Beulich
  2019-06-03 13:00 ` [Xen-devel] " Jan Beulich
  2019-06-04  8:48 ` Roger Pau Monné
@ 2019-06-04  9:04 ` Andrew Cooper
  2019-06-04  9:04   ` [Xen-devel] " Andrew Cooper
                     ` (2 more replies)
  2019-06-19 16:29 ` Woods, Brian
  3 siblings, 3 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-06-04  9:04 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

On 03/06/2019 14:00, Jan Beulich wrote:
> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
> was redundant with amd_iommu_detect_one_acpi() already calling
> pci_ro_device().

Seeing as amd_iommu_detect_one_acpi() hasn't changed for many years, and
b6bd02b7 was a functional fix for booting PVH on AMD, I can't see what
would make this true now.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04  9:04 ` Andrew Cooper
@ 2019-06-04  9:04   ` Andrew Cooper
  2019-06-04  9:21   ` Jan Beulich
  2019-06-04 12:08   ` Jan Beulich
  2 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-06-04  9:04 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

On 03/06/2019 14:00, Jan Beulich wrote:
> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
> was redundant with amd_iommu_detect_one_acpi() already calling
> pci_ro_device().

Seeing as amd_iommu_detect_one_acpi() hasn't changed for many years, and
b6bd02b7 was a functional fix for booting PVH on AMD, I can't see what
would make this true now.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04  9:04 ` Andrew Cooper
  2019-06-04  9:04   ` [Xen-devel] " Andrew Cooper
@ 2019-06-04  9:21   ` Jan Beulich
  2019-06-04  9:21     ` [Xen-devel] " Jan Beulich
  2019-06-04 12:08   ` Jan Beulich
  2 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2019-06-04  9:21 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

>>> On 04.06.19 at 11:04, <andrew.cooper3@citrix.com> wrote:
> On 03/06/2019 14:00, Jan Beulich wrote:
>> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
>> was redundant with amd_iommu_detect_one_acpi() already calling
>> pci_ro_device().
> 
> Seeing as amd_iommu_detect_one_acpi() hasn't changed for many years, and
> b6bd02b7 was a functional fix for booting PVH on AMD, I can't see what
> would make this true now.

I'd like to counter this: I can't see why that change was needed.
Clearly, if this was indeed needed for PVH Dom0 boot, there was
insufficient reasoning as to why the existing pci_ro_device()
invocation was not enough. I, for one, surely didn't even recall
it was there when reviewing the patch, or else I would have asked
for a more complete description.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04  9:21   ` Jan Beulich
@ 2019-06-04  9:21     ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-06-04  9:21 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

>>> On 04.06.19 at 11:04, <andrew.cooper3@citrix.com> wrote:
> On 03/06/2019 14:00, Jan Beulich wrote:
>> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
>> was redundant with amd_iommu_detect_one_acpi() already calling
>> pci_ro_device().
> 
> Seeing as amd_iommu_detect_one_acpi() hasn't changed for many years, and
> b6bd02b7 was a functional fix for booting PVH on AMD, I can't see what
> would make this true now.

I'd like to counter this: I can't see why that change was needed.
Clearly, if this was indeed needed for PVH Dom0 boot, there was
insufficient reasoning as to why the existing pci_ro_device()
invocation was not enough. I, for one, surely didn't even recall
it was there when reviewing the patch, or else I would have asked
for a more complete description.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04  8:48 ` Roger Pau Monné
  2019-06-04  8:48   ` [Xen-devel] " Roger Pau Monné
@ 2019-06-04  9:29   ` Jan Beulich
  2019-06-04  9:29     ` [Xen-devel] " Jan Beulich
  2019-06-04 13:02   ` Jan Beulich
  2 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2019-06-04  9:29 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 04.06.19 at 10:48, <roger.pau@citrix.com> wrote:
> On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
>> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
>> was redundant with amd_iommu_detect_one_acpi() already calling
>> pci_ro_device().
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I think this needs to be squashed together with your `AMD/IOMMU: don't
> "add" IOMMUs` patch, or else PVH dom0 will break because
> update_paging_mode will find devices not behind an IOMMU assigned to
> dom0, thus returning an error and crashing dom0.

How would update_paging_mode() get to see an IOMMU assigned
to Dom0? pci_ro_device() hands it to dom_xen. The point the other
patch you name is about is the fact that
_setup_hwdom_pci_devices() temporarily overrides the assignment
(I'm not convinced this is a good idea in general, but it surely is
needed for e.g. the graphics device that we may also hide) -
otherwise we could simply make amd_iommu_add_device() skip
dom_xen-assigned devices.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04  9:29   ` Jan Beulich
@ 2019-06-04  9:29     ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-06-04  9:29 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 04.06.19 at 10:48, <roger.pau@citrix.com> wrote:
> On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
>> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
>> was redundant with amd_iommu_detect_one_acpi() already calling
>> pci_ro_device().
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I think this needs to be squashed together with your `AMD/IOMMU: don't
> "add" IOMMUs` patch, or else PVH dom0 will break because
> update_paging_mode will find devices not behind an IOMMU assigned to
> dom0, thus returning an error and crashing dom0.

How would update_paging_mode() get to see an IOMMU assigned
to Dom0? pci_ro_device() hands it to dom_xen. The point the other
patch you name is about is the fact that
_setup_hwdom_pci_devices() temporarily overrides the assignment
(I'm not convinced this is a good idea in general, but it surely is
needed for e.g. the graphics device that we may also hide) -
otherwise we could simply make amd_iommu_add_device() skip
dom_xen-assigned devices.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04  9:04 ` Andrew Cooper
  2019-06-04  9:04   ` [Xen-devel] " Andrew Cooper
  2019-06-04  9:21   ` Jan Beulich
@ 2019-06-04 12:08   ` Jan Beulich
  2 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-06-04 12:08 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: xen-devel, Brian Woods, Suravee Suthikulpanit, Roger Pau Monne

>>> On 04.06.19 at 11:04, <andrew.cooper3@citrix.com> wrote:
> On 03/06/2019 14:00, Jan Beulich wrote:
>> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
>> was redundant with amd_iommu_detect_one_acpi() already calling
>> pci_ro_device().
> 
> Seeing as amd_iommu_detect_one_acpi() hasn't changed for many years, and
> b6bd02b7 was a functional fix for booting PVH on AMD, I can't see what
> would make this true now.

Just as a side note - amd_iommu_detect_one_acpi() now gets called
quite a bit earlier afaict. Yet I don't think this makes a difference here.
Nevertheless, as things stand the commit I'm suggesting to revert
can at best have papered over an issue elsewhere. Unless I'm missing
something.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04  8:48 ` Roger Pau Monné
  2019-06-04  8:48   ` [Xen-devel] " Roger Pau Monné
  2019-06-04  9:29   ` Jan Beulich
@ 2019-06-04 13:02   ` Jan Beulich
  2019-06-04 16:20     ` Roger Pau Monné
  2 siblings, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2019-06-04 13:02 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 04.06.19 at 10:48, <roger.pau@citrix.com> wrote:
> On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
>> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
>> was redundant with amd_iommu_detect_one_acpi() already calling
>> pci_ro_device().
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> I think this needs to be squashed together with your `AMD/IOMMU: don't
> "add" IOMMUs` patch, or else PVH dom0 will break because
> update_paging_mode will find devices not behind an IOMMU assigned to
> dom0, thus returning an error and crashing dom0.

I've taken another look (while correcting the other patch, now sent
as v2): update_paging_mode() iterates over all devices the domain
owns. The IOMMU ones, having been subject of an early
pci_ro_device(), should never end up on Dom0's list. And looking at
the code I also can't - for now at least - see how they could get
moved there. In fact I've verified that they take the "override"
path in _setup_hwdom_pci_devices().

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04 13:02   ` Jan Beulich
@ 2019-06-04 16:20     ` Roger Pau Monné
  2019-06-05  7:55       ` Jan Beulich
  0 siblings, 1 reply; 20+ messages in thread
From: Roger Pau Monné @ 2019-06-04 16:20 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Tue, Jun 04, 2019 at 07:02:06AM -0600, Jan Beulich wrote:
> >>> On 04.06.19 at 10:48, <roger.pau@citrix.com> wrote:
> > On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
> >> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
> >> was redundant with amd_iommu_detect_one_acpi() already calling
> >> pci_ro_device().
> >> 
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> > I think this needs to be squashed together with your `AMD/IOMMU: don't
> > "add" IOMMUs` patch, or else PVH dom0 will break because
> > update_paging_mode will find devices not behind an IOMMU assigned to
> > dom0, thus returning an error and crashing dom0.
> 
> I've taken another look (while correcting the other patch, now sent
> as v2): update_paging_mode() iterates over all devices the domain
> owns. The IOMMU ones, having been subject of an early
> pci_ro_device(), should never end up on Dom0's list. And looking at
> the code I also can't - for now at least - see how they could get
> moved there. In fact I've verified that they take the "override"
> path in _setup_hwdom_pci_devices().

As you realized this commit was indeed papering over an existing issue
elsewhere. When I did this patch I didn't realize
amd_iommu_detect_one_acpi was calling pci_ro_device.

The issue is that when pci_ro_device gets called by
amd_iommu_detect_one_acpi dom_xen has not been created yet, so
pdev->domain ends up being NULL.

On a tangential note, there's also a dereference of dom_xen in
_pci_hide_device which doesn't trigger a page fault. Do we have
something mapped at linear address 0 on purpose?

The assert in the following diff triggers for me:

[...]
(XEN) PCI: Not using MCFG for segment 0000 bus 00-7f
(XEN) AMD-Vi: Found MSI capability block at 0x64
(XEN) Assertion 'dom_xen' failed at pci.c:453
(XEN) ----[ Xen-4.13-unstable  x86_64  debug=y   Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d080255b4a>] pci.c#_pci_hide_device+0x63/0x65
(XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: ffff83042f373a90   rcx: 0000000000000000
(XEN) rdx: ffff82d08049ffff   rsi: 000000000000000a   rdi: ffff82d080493698
(XEN) rbp: ffff82d08049fc68   rsp: ffff82d08049fc58   r8:  ffff82d0804496a0
(XEN) r9:  0000000000000034   r10: 0000000000000220   r11: 0000000000000003
(XEN) r12: ffff83042f373a90   r13: 0000000000000002   r14: ffff83042f3735a0
(XEN) r15: 0000000000000002   cr0: 0000000080050033   cr4: 00000000000000a0
(XEN) cr3: 0000000091e92000   cr2: 0000000000000000
(XEN) fsb: 0000000091a00000   gsb: 0000000000000000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d080255b4a> (pci.c#_pci_hide_device+0x63/0x65):
(XEN)  48 83 c4 08 5b 5d eb a4 <0f> 0b 55 48 89 e5 41 57 41 56 41 55 41 54 53 48
(XEN) Xen stack trace from rsp=ffff82d08049fc58:
(XEN)    ffff83042f3735a0 0000000000000000 ffff82d08049fc98 ffff82d080410cc6
(XEN)    0000000000000000 ffff83042f373920 0000000000000064 0000000000000002
(XEN)    ffff82d08049fce8 ffff82d08041453d ffff82d08049fcd8 ffff82d0802671e1
(XEN)    ffff82d08049fcd8 0000000000000030 ffff82c0002195e8 ffff82c0002195b8
(XEN)    0000000000000002 ffff83000009efb0 ffff82d08049fd18 ffff82d080414bf5
(XEN)    ffff82d08049fd18 ffff82d080414b36 0000000000000001 ffff82d0803f8000
(XEN)    ffff82d08049fd48 ffff82d080418d2d 0000000008000000 ffff82c0002195b8
(XEN)    ffff82d08049fd68 0000000000000000 ffff82d08049fd58 ffff82d0804163f9
(XEN)    ffff82d08049fd68 ffff82d080413dfd ffff82d08049fd88 ffff82d08042aa08
(XEN)    000000000202ff7f ffff82d080387000 ffff82d08049fee8 ffff82d080424ece
(XEN)    00000000003d7f00 0000000000000002 0000000000000002 0000000000000002
(XEN)    0000000000000001 0000000000000001 0000000000000001 0000000000000001
(XEN)    0000000000000000 00000000000001ff 0000000001f3f000 0000000001f3ffff
(XEN)    000000000202ff80 0000000001f3f000 000000000202fe00 0000000000000000
(XEN)    ffffffff00800163 000000202ff80000 ffff83000009eee0 ffffff0100000000
(XEN)    000000202fe00000 000000202fe00000 ffff83000009ef80 ffff83000009efb0
(XEN)    0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN)    0000000000000004 00000040ffffffff 0000000000000400 0000000800000000
(XEN)    000000010000006e 0000000000000003 00000000000002f8 0000000000000002
(XEN)    0000000000000000 0000000000000000 0000000000000048 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d080255b4a>] pci.c#_pci_hide_device+0x63/0x65
(XEN)    [<ffff82d080410cc6>] pci_ro_device+0x5d/0xca
(XEN)    [<ffff82d08041453d>] amd_iommu_detect_one_acpi+0x1ac/0x254
(XEN)    [<ffff82d080414bf5>] iommu_acpi.c#detect_iommu_acpi+0xbf/0xfb
(XEN)    [<ffff82d080418d2d>] acpi_table_parse+0x61/0x90
(XEN)    [<ffff82d0804163f9>] amd_iommu_detect_acpi+0x17/0x19
(XEN)    [<ffff82d080413dfd>] acpi_ivrs_init+0x20/0x5b
(XEN)    [<ffff82d08042aa08>] acpi_boot_init+0x313/0x318
(XEN)    [<ffff82d080424ece>] __start_xen+0x1f64/0x2979
(XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'dom_xen' failed at pci.c:453
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
---8<---
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index faf5ccdd95..e66122cf8a 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -448,6 +448,8 @@ static void _pci_hide_device(struct pci_dev *pdev)
 {
     if ( pdev->domain )
         return;
+
+    ASSERT(dom_xen);
     pdev->domain = dom_xen;
     list_add(&pdev->domain_list, &dom_xen->arch.pdev_list);
 }


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-04 16:20     ` Roger Pau Monné
@ 2019-06-05  7:55       ` Jan Beulich
  2019-06-05  8:02         ` Jan Beulich
  2019-06-05  9:06         ` Roger Pau Monné
  0 siblings, 2 replies; 20+ messages in thread
From: Jan Beulich @ 2019-06-05  7:55 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 04.06.19 at 18:20, <roger.pau@citrix.com> wrote:
> On Tue, Jun 04, 2019 at 07:02:06AM -0600, Jan Beulich wrote:
>> >>> On 04.06.19 at 10:48, <roger.pau@citrix.com> wrote:
>> > On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
>> >> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
>> >> was redundant with amd_iommu_detect_one_acpi() already calling
>> >> pci_ro_device().
>> >> 
>> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> > 
>> > I think this needs to be squashed together with your `AMD/IOMMU: don't
>> > "add" IOMMUs` patch, or else PVH dom0 will break because
>> > update_paging_mode will find devices not behind an IOMMU assigned to
>> > dom0, thus returning an error and crashing dom0.
>> 
>> I've taken another look (while correcting the other patch, now sent
>> as v2): update_paging_mode() iterates over all devices the domain
>> owns. The IOMMU ones, having been subject of an early
>> pci_ro_device(), should never end up on Dom0's list. And looking at
>> the code I also can't - for now at least - see how they could get
>> moved there. In fact I've verified that they take the "override"
>> path in _setup_hwdom_pci_devices().
> 
> As you realized this commit was indeed papering over an existing issue
> elsewhere. When I did this patch I didn't realize
> amd_iommu_detect_one_acpi was calling pci_ro_device.
> 
> The issue is that when pci_ro_device gets called by
> amd_iommu_detect_one_acpi dom_xen has not been created yet, so
> pdev->domain ends up being NULL.

Well, that's being fixed by "adjust system domain creation (and call it
earlier on x86)" (note that it's "special" rather than "system" in the
posted version). pdev->domain remaining to be NULL really is the
smaller of the problems; accessing dom_xen->arch.pdev_list is the
worse part here.

One thing is curious though: So far I thought I would have screwed
up things by having amd_iommu_detect_one_acpi() called earlier,
as mentioned in that patch's description. Your change that I'd like
to revert predates that though by several months, so I'm getting
the impression the issue has been older. As a result the range of
versions to backport this to may have to grow.

> On a tangential note, there's also a dereference of dom_xen in
> _pci_hide_device which doesn't trigger a page fault. Do we have
> something mapped at linear address 0 on purpose?

Yes, during early (legacy) boot. That's how the initial page tables
get constructed. And I did notice it as an actual crash because the
newer box boots from EFI, where there's no mapping at linear
address 0. Hence that patch mentioned above.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-05  7:55       ` Jan Beulich
@ 2019-06-05  8:02         ` Jan Beulich
  2019-06-05  9:52           ` Roger Pau Monné
  2019-06-05  9:06         ` Roger Pau Monné
  1 sibling, 1 reply; 20+ messages in thread
From: Jan Beulich @ 2019-06-05  8:02 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 05.06.19 at 09:55, <JBeulich@suse.com> wrote:
>>>> On 04.06.19 at 18:20, <roger.pau@citrix.com> wrote:
>> On Tue, Jun 04, 2019 at 07:02:06AM -0600, Jan Beulich wrote:
>>> >>> On 04.06.19 at 10:48, <roger.pau@citrix.com> wrote:
>>> > On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
>>> >> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
>>> >> was redundant with amd_iommu_detect_one_acpi() already calling
>>> >> pci_ro_device().
>>> >> 
>>> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> > 
>>> > I think this needs to be squashed together with your `AMD/IOMMU: don't
>>> > "add" IOMMUs` patch, or else PVH dom0 will break because
>>> > update_paging_mode will find devices not behind an IOMMU assigned to
>>> > dom0, thus returning an error and crashing dom0.
>>> 
>>> I've taken another look (while correcting the other patch, now sent
>>> as v2): update_paging_mode() iterates over all devices the domain
>>> owns. The IOMMU ones, having been subject of an early
>>> pci_ro_device(), should never end up on Dom0's list. And looking at
>>> the code I also can't - for now at least - see how they could get
>>> moved there. In fact I've verified that they take the "override"
>>> path in _setup_hwdom_pci_devices().
>> 
>> As you realized this commit was indeed papering over an existing issue
>> elsewhere. When I did this patch I didn't realize
>> amd_iommu_detect_one_acpi was calling pci_ro_device.
>> 
>> The issue is that when pci_ro_device gets called by
>> amd_iommu_detect_one_acpi dom_xen has not been created yet, so
>> pdev->domain ends up being NULL.
> 
> Well, that's being fixed by "adjust system domain creation (and call it
> earlier on x86)" (note that it's "special" rather than "system" in the
> posted version). pdev->domain remaining to be NULL really is the
> smaller of the problems; accessing dom_xen->arch.pdev_list is the
> worse part here.
> 
> One thing is curious though: So far I thought I would have screwed
> up things by having amd_iommu_detect_one_acpi() called earlier,
> as mentioned in that patch's description. Your change that I'd like
> to revert predates that though by several months, so I'm getting
> the impression the issue has been older. As a result the range of
> versions to backport this to may have to grow.

And no, I cannot confirm this as the original (4.12) behavior: There
I see iommu_setup() getting called a few lines after
arch_init_memory(). Therefore I'm still unclear what exact problem
the pci_hide_device() addition was really meant to fix.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-05  7:55       ` Jan Beulich
  2019-06-05  8:02         ` Jan Beulich
@ 2019-06-05  9:06         ` Roger Pau Monné
  2019-06-05  9:12           ` Andrew Cooper
  2019-06-05  9:14           ` Jan Beulich
  1 sibling, 2 replies; 20+ messages in thread
From: Roger Pau Monné @ 2019-06-05  9:06 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Wed, Jun 05, 2019 at 01:55:42AM -0600, Jan Beulich wrote:
> >>> On 04.06.19 at 18:20, <roger.pau@citrix.com> wrote:
> > On Tue, Jun 04, 2019 at 07:02:06AM -0600, Jan Beulich wrote:
> >> >>> On 04.06.19 at 10:48, <roger.pau@citrix.com> wrote:
> >> > On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
> >> >> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
> >> >> was redundant with amd_iommu_detect_one_acpi() already calling
> >> >> pci_ro_device().
> >> >> 
> >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> > 
> >> > I think this needs to be squashed together with your `AMD/IOMMU: don't
> >> > "add" IOMMUs` patch, or else PVH dom0 will break because
> >> > update_paging_mode will find devices not behind an IOMMU assigned to
> >> > dom0, thus returning an error and crashing dom0.
> >> 
> >> I've taken another look (while correcting the other patch, now sent
> >> as v2): update_paging_mode() iterates over all devices the domain
> >> owns. The IOMMU ones, having been subject of an early
> >> pci_ro_device(), should never end up on Dom0's list. And looking at
> >> the code I also can't - for now at least - see how they could get
> >> moved there. In fact I've verified that they take the "override"
> >> path in _setup_hwdom_pci_devices().
> > 
> > As you realized this commit was indeed papering over an existing issue
> > elsewhere. When I did this patch I didn't realize
> > amd_iommu_detect_one_acpi was calling pci_ro_device.
> > 
> > The issue is that when pci_ro_device gets called by
> > amd_iommu_detect_one_acpi dom_xen has not been created yet, so
> > pdev->domain ends up being NULL.
> 
> Well, that's being fixed by "adjust system domain creation (and call it
> earlier on x86)" (note that it's "special" rather than "system" in the
> posted version). pdev->domain remaining to be NULL really is the
> smaller of the problems; accessing dom_xen->arch.pdev_list is the
> worse part here.

Oh, didn't see that series, thanks for the fixes, will try to review
them later today.

> One thing is curious though: So far I thought I would have screwed
> up things by having amd_iommu_detect_one_acpi() called earlier,
> as mentioned in that patch's description. Your change that I'd like
> to revert predates that though by several months, so I'm getting
> the impression the issue has been older. As a result the range of
> versions to backport this to may have to grow.

Yes, I'm still trying to figure out exactly what caused the issue that
b6bd02b7a877 fixed.

> > On a tangential note, there's also a dereference of dom_xen in
> > _pci_hide_device which doesn't trigger a page fault. Do we have
> > something mapped at linear address 0 on purpose?
> 
> Yes, during early (legacy) boot.

FWIW, I'm booting from UEFI using multiboot2 and I didn't get a page
fault in _pci_hide_device.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-05  9:06         ` Roger Pau Monné
@ 2019-06-05  9:12           ` Andrew Cooper
  2019-06-05  9:14           ` Jan Beulich
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-06-05  9:12 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich
  Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On 05/06/2019 10:06, Roger Pau Monné wrote:
>>> On a tangential note, there's also a dereference of dom_xen in
>>> _pci_hide_device which doesn't trigger a page fault. Do we have
>>> something mapped at linear address 0 on purpose?
>> Yes, during early (legacy) boot.
> FWIW, I'm booting from UEFI using multiboot2 and I didn't get a page
> fault in _pci_hide_device.

What Jan actually meant was "non-EFI boot".

The Grub2+Multiboot2 will have a mapping of 0, which hides the problem.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-05  9:06         ` Roger Pau Monné
  2019-06-05  9:12           ` Andrew Cooper
@ 2019-06-05  9:14           ` Jan Beulich
  1 sibling, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-06-05  9:14 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

>>> On 05.06.19 at 11:06, <roger.pau@citrix.com> wrote:
> On Wed, Jun 05, 2019 at 01:55:42AM -0600, Jan Beulich wrote:
>> >>> On 04.06.19 at 18:20, <roger.pau@citrix.com> wrote:
>> > On a tangential note, there's also a dereference of dom_xen in
>> > _pci_hide_device which doesn't trigger a page fault. Do we have
>> > something mapped at linear address 0 on purpose?
>> 
>> Yes, during early (legacy) boot.
> 
> FWIW, I'm booting from UEFI using multiboot2 and I didn't get a page
> fault in _pci_hide_device.

Oh, right, that's also using the pre-constructed page tables iirc.
When I say "EFI boot" most of the time it would mean "native"
EFI mode, not that with another boot loader in the middle. But
yes, I can see how my earlier reply was imprecise / ambiguous.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-05  8:02         ` Jan Beulich
@ 2019-06-05  9:52           ` Roger Pau Monné
  0 siblings, 0 replies; 20+ messages in thread
From: Roger Pau Monné @ 2019-06-05  9:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Brian Woods, Suravee Suthikulpanit

On Wed, Jun 05, 2019 at 02:02:43AM -0600, Jan Beulich wrote:
> >>> On 05.06.19 at 09:55, <JBeulich@suse.com> wrote:
> >>>> On 04.06.19 at 18:20, <roger.pau@citrix.com> wrote:
> >> On Tue, Jun 04, 2019 at 07:02:06AM -0600, Jan Beulich wrote:
> >>> >>> On 04.06.19 at 10:48, <roger.pau@citrix.com> wrote:
> >>> > On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
> >>> >> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
> >>> >> was redundant with amd_iommu_detect_one_acpi() already calling
> >>> >> pci_ro_device().
> >>> >> 
> >>> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >>> > 
> >>> > I think this needs to be squashed together with your `AMD/IOMMU: don't
> >>> > "add" IOMMUs` patch, or else PVH dom0 will break because
> >>> > update_paging_mode will find devices not behind an IOMMU assigned to
> >>> > dom0, thus returning an error and crashing dom0.
> >>> 
> >>> I've taken another look (while correcting the other patch, now sent
> >>> as v2): update_paging_mode() iterates over all devices the domain
> >>> owns. The IOMMU ones, having been subject of an early
> >>> pci_ro_device(), should never end up on Dom0's list. And looking at
> >>> the code I also can't - for now at least - see how they could get
> >>> moved there. In fact I've verified that they take the "override"
> >>> path in _setup_hwdom_pci_devices().
> >> 
> >> As you realized this commit was indeed papering over an existing issue
> >> elsewhere. When I did this patch I didn't realize
> >> amd_iommu_detect_one_acpi was calling pci_ro_device.
> >> 
> >> The issue is that when pci_ro_device gets called by
> >> amd_iommu_detect_one_acpi dom_xen has not been created yet, so
> >> pdev->domain ends up being NULL.
> > 
> > Well, that's being fixed by "adjust system domain creation (and call it
> > earlier on x86)" (note that it's "special" rather than "system" in the
> > posted version). pdev->domain remaining to be NULL really is the
> > smaller of the problems; accessing dom_xen->arch.pdev_list is the
> > worse part here.
> > 
> > One thing is curious though: So far I thought I would have screwed
> > up things by having amd_iommu_detect_one_acpi() called earlier,
> > as mentioned in that patch's description. Your change that I'd like
> > to revert predates that though by several months, so I'm getting
> > the impression the issue has been older. As a result the range of
> > versions to backport this to may have to grow.
> 
> And no, I cannot confirm this as the original (4.12) behavior: There
> I see iommu_setup() getting called a few lines after
> arch_init_memory(). Therefore I'm still unclear what exact problem
> the pci_hide_device() addition was really meant to fix.

AFAICT this was an error on my side, the commit that fixed the
update_paging_mode related issue is 4250a4416225c, and b6bd02b7a877 is
plain wrong unless I'm missing something.

PVH dom0 AMD support in 4.11 is not working and it's not easy to
bisect the changes between 4.11 and 4.12 in that state.

For the revert:

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen"
  2019-06-03 13:00 [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen" Jan Beulich
                   ` (2 preceding siblings ...)
  2019-06-04  9:04 ` Andrew Cooper
@ 2019-06-19 16:29 ` Woods, Brian
  3 siblings, 0 replies; 20+ messages in thread
From: Woods, Brian @ 2019-06-19 16:29 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Woods, Brian, Suthikulpanit, Suravee, Roger Pau Monne

On Mon, Jun 03, 2019 at 07:00:25AM -0600, Jan Beulich wrote:
> [CAUTION: External Email]
> 
> This reverts commit b6bd02b7a877f9fac2de69e64d8245d56f92ab25. The change
> was redundant with amd_iommu_detect_one_acpi() already calling
> pci_ro_device().
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Brian Woods <brian.woods@amd.com>

> --- a/xen/drivers/passthrough/amd/iommu_init.c
> +++ b/xen/drivers/passthrough/amd/iommu_init.c
> @@ -1021,8 +1021,6 @@ static void * __init allocate_ppr_log(st
> 
>  static int __init amd_iommu_init_one(struct amd_iommu *iommu)
>  {
> -    pci_hide_device(iommu->seg, PCI_BUS(iommu->bdf), PCI_DEVFN2(iommu->bdf));
> -
>      if ( map_iommu_mmio_region(iommu) != 0 )
>          goto error_out;
> 
> 
> 

-- 
Brian Woods

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2019-06-19 16:29 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-03 13:00 [PATCH] AMD/IOMMU: revert "amd/iommu: assign iommu devices to Xen" Jan Beulich
2019-06-03 13:00 ` [Xen-devel] " Jan Beulich
2019-06-04  8:48 ` Roger Pau Monné
2019-06-04  8:48   ` [Xen-devel] " Roger Pau Monné
2019-06-04  9:29   ` Jan Beulich
2019-06-04  9:29     ` [Xen-devel] " Jan Beulich
2019-06-04 13:02   ` Jan Beulich
2019-06-04 16:20     ` Roger Pau Monné
2019-06-05  7:55       ` Jan Beulich
2019-06-05  8:02         ` Jan Beulich
2019-06-05  9:52           ` Roger Pau Monné
2019-06-05  9:06         ` Roger Pau Monné
2019-06-05  9:12           ` Andrew Cooper
2019-06-05  9:14           ` Jan Beulich
2019-06-04  9:04 ` Andrew Cooper
2019-06-04  9:04   ` [Xen-devel] " Andrew Cooper
2019-06-04  9:21   ` Jan Beulich
2019-06-04  9:21     ` [Xen-devel] " Jan Beulich
2019-06-04 12:08   ` Jan Beulich
2019-06-19 16:29 ` Woods, Brian

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).