iommu.lists.linux-foundation.org archive mirror
* AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=0xfffffffdf8030000 flags=0x0008]
@ 2020-12-03  6:18 Marc Smith
  2020-12-04 15:10 ` Marc Smith
  0 siblings, 1 reply; 2+ messages in thread
From: Marc Smith @ 2020-12-03  6:18 UTC (permalink / raw)
  To: iommu

Hi,

First, I must preface this email by apologizing in advance for asking
about a distro kernel (RHEL in this case). I'm not really reporting
this problem here or requesting a fix (I know that should be taken up
with the vendor); rather, I'm hoping someone can give me a few
hints/pointers on where to look next while debugging this issue.

I'm using RHEL 7.8.2003 (CentOS) with a 3.10.0-1127.18.2.el7 kernel.
The systems use a Supermicro H12SSW-NT board (AMD), and we have the
IOMMU enabled along with SR-IOV. I have several virtual machines
(QEMU/KVM) running on these servers, and I'm passing PCIe endpoints
into the VMs (in some cases the whole PCIe endpoint itself; for some
devices I use SR-IOV and pass the VFs into the VMs). The VMs run
Linux as their guest OS (a couple of different distros).
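
For reference, the passthrough plumbing on the host is nothing
exotic -- roughly equivalent to the sketch below (vfio-pci binding
via sysfs, plus sriov_numvfs for the SR-IOV cases). The PCI
addresses and VF count here are placeholders, not my actual devices:

# Rough sketch of the passthrough setup (placeholder addresses, not
# my actual devices): detach the device from its host driver, steer
# it to vfio-pci, and reprobe; for SR-IOV parents, create VFs first.
import os

def bind_to_vfio(bdf):
    dev = "/sys/bus/pci/devices/" + bdf
    if os.path.exists(dev + "/driver"):      # detach current host driver
        with open(dev + "/driver/unbind", "w") as f:
            f.write(bdf)
    with open(dev + "/driver_override", "w") as f:
        f.write("vfio-pci")                  # pin the device to vfio-pci
    with open("/sys/bus/pci/drivers_probe", "w") as f:
        f.write(bdf)                         # trigger a reprobe

def create_vfs(bdf, count):
    with open("/sys/bus/pci/devices/%s/sriov_numvfs" % bdf, "w") as f:
        f.write(str(count))                  # e.g. 4 VFs on an SR-IOV NIC

bind_to_vfio("0000:42:00.0")  # placeholder BDF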

While the servers (VMs) are idle, I don't experience any problems.
But when I start doing a lot of I/O in the virtual machines (iSCSI
across Ethernet interfaces, disk I/O via SAS HBAs that are passed
into the VMs, etc.), I notice the following at the host layer
("hypervisor") after some time:
Nov 29 10:50:00 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=0xfffffffdf8030000 flags=0x0008]
Nov 29 22:02:03 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=c8:02.1 domain=0x005f address=0xfffffffdf8060000 flags=0x0008]
Nov 30 02:13:54 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=0xfffffffdf8020000 flags=0x0008]
Nov 30 02:28:44 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=c8:02.0 domain=0x005e address=0xfffffffdf8020000 flags=0x0008]
Nov 30 10:48:53 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x005e address=0xfffffffdf8040000 flags=0x0008]
Dec  2 07:05:22 node1 kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=c8:03.0 domain=0x005e address=0xfffffffdf8010000 flags=0x0008]
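
As an aside, I tried decoding the flags value. If I'm reading the
event-log entry layout in the AMD IOMMU spec (and the EVENT_FLAG_*
defines in the kernel's amd_iommu_types.h) correctly -- and I may
well not be -- flags=0x0008 would mean only the I bit is set, i.e.
the faulting transaction was an interrupt request rather than a
memory read/write. If that's right, this smells interrupt-remapping
related rather than DMA-translation related:

# Hedged sketch: decode the IO_PAGE_FAULT flags per my reading of
# the AMD IOMMU spec's event-log entry (bit names as in the spec --
# treat this layout as an assumption on my part, not authoritative).
FLAG_BITS = [
    (0x001, "GN (guest/nested transaction)"),
    (0x002, "NX (no-execute violation)"),
    (0x004, "US (user/supervisor)"),
    (0x008, "I  (transaction was an interrupt request)"),
    (0x010, "PR (translation/interrupt entry was present)"),
    (0x020, "RW (write transaction)"),
    (0x040, "PE (permission error)"),
    (0x080, "RZ (reserved bit set in entry)"),
    (0x100, "TR (translation request)"),
]

def decode_flags(flags):
    return [name for bit, name in FLAG_BITS if flags & bit] or ["(none set)"]

print(decode_flags(0x0008))  # -> ['I  (transaction was an interrupt request)']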

These events happen for all of the PCIe devices that are passed into
the VMs, although not all at once... as you can see from the
timestamps above, they are not very frequent even under heavy load
(in the log snippet above, the system was running a big workload
over several days). For the Ethernet devices that are passed into
the VMs, I noticed that they experience transmit hangs / resets in
the virtual machines, and when these occur, each hang corresponds to
a matching IO_PAGE_FAULT for that PCI device.
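
The matching itself is nothing fancy -- roughly the sketch below,
run on the host, then eyeballing the timestamps against the guest's
TX-timeout messages (the BDF is a placeholder for whichever PF/VF
the hung NIC maps to on the host):

# Sketch: pull the host-side IO_PAGE_FAULT events for one device so
# the timestamps can be compared against guest-side TX timeouts. The
# BDF is a placeholder for whichever PF/VF the hung NIC maps to.
import subprocess

BDF = "42:00.0"  # placeholder
log = subprocess.check_output(["journalctl", "-k", "--no-pager"]).decode()
for line in log.splitlines():
    if "IO_PAGE_FAULT" in line and ("device=" + BDF) in line:
        print(line)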

FWIW, those NIC hangs look like this (visible in the VM guest OS):
[17879.279091] NETDEV WATCHDOG: s1p1 (bnxt_en): transmit queue 2 timed out
[17879.279111] WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:447 dev_watchdog+0x121/0x17e
...
[17879.279213] bnxt_en 0000:01:09.0 s1p1: TX timeout detected, starting reset task!
[17883.075299] bnxt_en 0000:01:09.0 s1p1: Resp cmpl intr err msg: 0x51
[17883.075302] bnxt_en 0000:01:09.0 s1p1: hwrm_ring_free type 1 failed. rc:fffffff0 err:0
[17886.957100] bnxt_en 0000:01:09.0 s1p1: Resp cmpl intr err msg: 0x51
[17886.957103] bnxt_en 0000:01:09.0 s1p1: hwrm_ring_free type 2 failed. rc:fffffff0 err:0
[17890.843023] bnxt_en 0000:01:09.0 s1p1: Resp cmpl intr err msg: 0x51
[17890.843025] bnxt_en 0000:01:09.0 s1p1: hwrm_ring_free type 2 failed. rc:fffffff0 err:0

We see these NIC hangs in the VMs with both Broadcom and Mellanox
Ethernet adapters that are passed in, so I don't think it's the NICs
themselves causing the IO_PAGE_FAULT events observed in the
hypervisor. Plus, we see IO_PAGE_FAULTs for devices other than
Ethernet adapters.


I have several of these same servers (all using the same motherboard,
processor, memory, BIOS, etc.) and they all exhibit this behavior
with the IO_PAGE_FAULT events, so I don't believe it to be any one
faulty server / component. My question is really where to dig/push
next. Is this perhaps an issue with the BIOS/firmware on these
motherboards? Something with the chipset (AMD IOMMU)? A colleague
has suggested that even the AGESA version may be a factor. Or should
I be focusing on the Linux kernel, i.e. the AMD IOMMU driver
(software)?

I've been poking around similar bug reports, and the IO_PAGE_FAULT
events and NIC resets / transmit hangs appear to be related in other
posts as well. This commit looked promising:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4e50ce03976fbc8ae995a000c4b10c737467beaa

But I see Red Hat has already back-ported it into their
3.10.0-1127.18.2.el7 kernel source. I'm open to trying a newer
vanilla Linux kernel (e.g., 5.4.x) but would prefer to resolve this
in the RHEL kernel I'm using now. I'll look into that next, although
due to the complex nature of this hypervisor/VM setup, it's a bit
tedious to test.


Kernel messages from boot (using the amd_iommu_dump=1 parameter):
...
[    0.214395] AMD-Vi: Using IVHD type 0x11
[    0.214627] AMD-Vi: device: c0:00.2 cap: 0040 seg: 0 flags: b0 info 0000
[    0.214628] AMD-Vi:        mmio-addr: 00000000f3700000
[    0.214634] AMD-Vi:   DEV_SELECT_RANGE_START  devid: c0:01.0 flags: 00
[    0.214635] AMD-Vi:   DEV_RANGE_END           devid: ff:1f.6
[    0.214763] AMD-Vi:   DEV_SPECIAL(IOAPIC[241])               devid: c0:00.1
[    0.214765] AMD-Vi: device: 80:00.2 cap: 0040 seg: 0 flags: b0 info 0000
[    0.214766] AMD-Vi:        mmio-addr: 00000000f2600000
[    0.214771] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 80:01.0 flags: 00
[    0.214772] AMD-Vi:   DEV_RANGE_END           devid: bf:1f.6
[    0.214900] AMD-Vi:   DEV_SPECIAL(IOAPIC[242])               devid: 80:00.1
[    0.214901] AMD-Vi: device: 40:00.2 cap: 0040 seg: 0 flags: b0 info 0000
[    0.214902] AMD-Vi:        mmio-addr: 00000000b4800000
[    0.214906] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 40:01.0 flags: 00
[    0.214907] AMD-Vi:   DEV_RANGE_END           devid: 7f:1f.6
[    0.215036] AMD-Vi:   DEV_SPECIAL(IOAPIC[243])               devid: 40:00.1
[    0.215037] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: b0 info 0000
[    0.215038] AMD-Vi:        mmio-addr: 00000000fc800000
[    0.215044] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:01.0 flags: 00
[    0.215045] AMD-Vi:   DEV_RANGE_END           devid: 3f:1f.6
[    0.215173] AMD-Vi:   DEV_ALIAS_RANGE                 devid: ff:00.0 flags: 00 devid_to: 00:14.4
[    0.215174] AMD-Vi:   DEV_RANGE_END           devid: ff:1f.7
[    0.215179] AMD-Vi:   DEV_SPECIAL(HPET[0])           devid: 00:14.0
[    0.215180] AMD-Vi:   DEV_SPECIAL(IOAPIC[240])               devid: 00:14.0
[    0.215181] AMD-Vi:   DEV_SPECIAL(IOAPIC[244])               devid: 00:00.1
...
[    4.345723] AMD-Vi: Found IOMMU at 0000:c0:00.2 cap 0x40
[    4.345724] AMD-Vi: Extended features (0x58f77ef22294ade):
[    4.345724]  PPR X2APIC NX GT IA GA PC GA_vAPIC
[    4.345728] AMD-Vi: Found IOMMU at 0000:80:00.2 cap 0x40
[    4.345729] AMD-Vi: Extended features (0x58f77ef22294ade):
[    4.345729]  PPR X2APIC NX GT IA GA PC GA_vAPIC
[    4.345731] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[    4.345732] AMD-Vi: Extended features (0x58f77ef22294ade):
[    4.345733]  PPR X2APIC NX GT IA GA PC GA_vAPIC
[    4.345735] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[    4.345735] AMD-Vi: Extended features (0x58f77ef22294ade):
[    4.345736]  PPR X2APIC NX GT IA GA PC GA_vAPIC
[    4.345737] AMD-Vi: Interrupt remapping enabled
[    4.345738] AMD-Vi: virtual APIC enabled
[    4.345739] AMD-Vi: X2APIC enabled
[    4.345805] pci 0000:c0:00.2: irq 26 for MSI/MSI-X
[    4.345947] pci 0000:80:00.2: irq 27 for MSI/MSI-X
[    4.346073] pci 0000:40:00.2: irq 28 for MSI/MSI-X
[    4.346208] pci 0000:00:00.2: irq 29 for MSI/MSI-X
[    4.346305] AMD-Vi: IO/TLB flush on unmap enabled
...

I have also tried 'amd_iommu=fullflush' (as reflected in the "IO/TLB
flush on unmap enabled" message above) on a hunch after reviewing
other users' posts with similar IO_PAGE_FAULT events, but this
doesn't seem to change anything -- the events still occur with or
without this kernel parameter.
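
(In case anyone wants to reproduce: the quick sanity check I'm using
to confirm the option actually took effect is just matching the
strings from the boot log above, roughly:)

# Trivial check that amd_iommu=fullflush was both requested and
# honored; the dmesg string matched here is the one from my boot log.
import subprocess

with open("/proc/cmdline") as f:
    cmdline = f.read()
print("fullflush requested:", "amd_iommu=fullflush" in cmdline)

dmesg = subprocess.check_output(["dmesg"]).decode()
print("fullflush active:   ", "AMD-Vi: IO/TLB flush on unmap enabled" in dmesg)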

So, any guidance/tips/advice on how to tackle this would be greatly
appreciated. Thank you for your consideration and time!


--Marc
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


* Re: AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=0xfffffffdf8030000 flags=0x0008]
  2020-12-03  6:18 AMD-Vi: Event logged [IO_PAGE_FAULT device=42:00.0 domain=0x005e address=0xfffffffdf8030000 flags=0x0008] Marc Smith
@ 2020-12-04 15:10 ` Marc Smith
  0 siblings, 0 replies; 2+ messages in thread
From: Marc Smith @ 2020-12-04 15:10 UTC (permalink / raw)
  To: iommu

On Thu, Dec 3, 2020 at 1:18 AM Marc Smith <msmith626@gmail.com> wrote:
>
> Hi,
>
> [...]
>
> So, any guidance/tips/advice on how to tackle this would be greatly
> appreciated. Thank you for your consideration and time!

I booted the systems with "amd_iommu_intr=legacy" and the problem went
away! No more IO_PAGE_FAULTs in the hypervisor, and no NIC
hangs/resets in the virtual machines! No noticeable degradation of I/O
performance either. Confirmed on two systems.
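
For anyone who lands on this thread later: the quick way I'm
confirming which interrupt-remapping mode the host came up in is to
look for the "AMD-Vi: virtual APIC enabled" line from the boot log in
my first message -- with amd_iommu_intr=legacy it no longer appears.
A minimal sketch:

# Confirm the interrupt-remapping mode after a reboot. With
# amd_iommu_intr=legacy, interrupt remapping is still enabled but the
# "virtual APIC enabled" line from my earlier boot log is gone.
import subprocess

dmesg = subprocess.check_output(["dmesg"]).decode()
print("IR enabled:   ", "AMD-Vi: Interrupt remapping enabled" in dmesg)
print("GA/vAPIC mode:", "AMD-Vi: virtual APIC enabled" in dmesg)  # expect False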

--Marc


