qemu-devel.nongnu.org archive mirror
* [PATCH v4 0/3] Fix MCE handling on AMD hosts
@ 2023-09-12 21:18 John Allen
  2023-09-12 21:18 ` [PATCH v4 1/3] i386: Fix MCE support for " John Allen
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: John Allen @ 2023-09-12 21:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, william.roche,
	joao.m.martins, pbonzini, richard.henderson, eduardo, John Allen

In the event that a guest process attempts to access memory that has
been poisoned in response to a deferred uncorrected MCE, an AMD system
will currently generate a SIGBUS error which will result in the entire
guest being shut down. Ideally, we only want to kill the guest process
that accessed poisoned memory in this case.

This support has been included in qemu for Intel hosts for a long time,
but there are a couple of changes needed for AMD hosts. First, we will
need to expose the SUCCOR cpuid bit to guests. Second, we need to modify
the MCE injection code to avoid Intel specific behavior when we are
running on an AMD host.
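
For reference, the SUCCOR capability lives in CPUID Fn8000_0007_EBX[1]
(the bit patch 3/3 exposes). A guest can probe it with a minimal,
purely illustrative check (not part of this series):

    /* Illustrative: check CPUID Fn8000_0007_EBX[1] (SUCCOR) in a guest. */
    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* __get_cpuid() returns 0 if the extended leaf is unavailable. */
        if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx)) {
            return 1;
        }
        printf("SUCCOR %ssupported\n", (ebx & (1u << 1)) ? "" : "not ");
        return 0;
    }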

v2:
  - Add "succor" feature word.
  - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.

v3:
  - Reorder series. Only enable SUCCOR after bugs have been fixed.
  - Introduce new patch ignoring AO errors.

v4:
  - Remove redundant check for AO errors.

John Allen (2):
  i386: Fix MCE support for AMD hosts
  i386: Add support for SUCCOR feature

William Roche (1):
  i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest

 target/i386/cpu.c     | 18 +++++++++++++++++-
 target/i386/cpu.h     |  4 ++++
 target/i386/helper.c  |  4 ++++
 target/i386/kvm/kvm.c | 28 ++++++++++++++++++++--------
 4 files changed, 45 insertions(+), 9 deletions(-)

-- 
2.39.3



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 1/3] i386: Fix MCE support for AMD hosts
  2023-09-12 21:18 [PATCH v4 0/3] Fix MCE handling on AMD hosts John Allen
@ 2023-09-12 21:18 ` John Allen
  2023-09-12 21:18 ` [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest John Allen
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: John Allen @ 2023-09-12 21:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, william.roche,
	joao.m.martins, pbonzini, richard.henderson, eduardo, John Allen

For the most part, AMD hosts can use the same MCE injection code as Intel, but
there are instances where the qemu implementation is Intel specific. First, MCE
delivery works differently on AMD and does not support broadcast. Second,
kvm_mce_inject generates MCEs that include a number of Intel specific status
bits. Modify kvm_mce_inject to properly generate MCEs on AMD platforms.

Reported-by: William Roche <william.roche@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
---
v3:
  - Update to latest qemu code that introduces using MCG_STATUS_RIPV in the
    case of a BUS_MCEERR_AR on a non-AMD machine.
---
 target/i386/helper.c  |  4 ++++
 target/i386/kvm/kvm.c | 17 +++++++++++------
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/target/i386/helper.c b/target/i386/helper.c
index 89aa696c6d..9547e2b09d 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -91,6 +91,10 @@ int cpu_x86_support_mca_broadcast(CPUX86State *env)
     int family = 0;
     int model = 0;
 
+    if (IS_AMD_CPU(env)) {
+        return 0;
+    }
+
     cpu_x86_version(env, &family, &model);
     if ((family == 6 && model >= 14) || family > 6) {
         return 1;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 639a242ad8..5fce74aac5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -590,16 +590,21 @@ static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int code)
     CPUState *cs = CPU(cpu);
     CPUX86State *env = &cpu->env;
     uint64_t status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN |
-                      MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S;
+                      MCI_STATUS_MISCV | MCI_STATUS_ADDRV;
     uint64_t mcg_status = MCG_STATUS_MCIP;
     int flags = 0;
 
-    if (code == BUS_MCEERR_AR) {
-        status |= MCI_STATUS_AR | 0x134;
-        mcg_status |= MCG_STATUS_RIPV | MCG_STATUS_EIPV;
+    if (!IS_AMD_CPU(env)) {
+        status |= MCI_STATUS_S;
+        if (code == BUS_MCEERR_AR) {
+            status |= MCI_STATUS_AR | 0x134;
+            mcg_status |= MCG_STATUS_RIPV | MCG_STATUS_EIPV;
+        } else {
+            status |= 0xc0;
+            mcg_status |= MCG_STATUS_RIPV;
+        }
     } else {
-        status |= 0xc0;
-        mcg_status |= MCG_STATUS_RIPV;
+        mcg_status |= MCG_STATUS_EIPV | MCG_STATUS_RIPV;
     }
 
     flags = cpu_x86_support_mca_broadcast(env) ? MCE_INJECT_BROADCAST : 0;
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-12 21:18 [PATCH v4 0/3] Fix MCE handling on AMD hosts John Allen
  2023-09-12 21:18 ` [PATCH v4 1/3] i386: Fix MCE support for " John Allen
@ 2023-09-12 21:18 ` John Allen
  2023-09-13  3:22   ` Gupta, Pankaj
  2023-09-18 22:00   ` William Roche
  2023-09-12 21:18 ` [PATCH v4 3/3] i386: Add support for SUCCOR feature John Allen
  2024-02-07 11:21 ` [PATCH v4 0/3] Fix MCE handling on AMD hosts Joao Martins
  3 siblings, 2 replies; 15+ messages in thread
From: John Allen @ 2023-09-12 21:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, william.roche,
	joao.m.martins, pbonzini, richard.henderson, eduardo

From: William Roche <william.roche@oracle.com>

AMD guests can't currently deal with BUS_MCEERR_AO MCE injection
as it panics the VM kernel. We filter this event and provide a
warning message.

Signed-off-by: William Roche <william.roche@oracle.com>
---
v3:
  - New patch
v4:
  - Remove redundant check for AO errors
---
 target/i386/kvm/kvm.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 5fce74aac5..7e9fc0cac5 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -604,6 +604,10 @@ static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int code)
             mcg_status |= MCG_STATUS_RIPV;
         }
     } else {
+        if (code == BUS_MCEERR_AO) {
+            /* XXX we don't support BUS_MCEERR_AO injection on AMD yet */
+            return;
+        }
         mcg_status |= MCG_STATUS_EIPV | MCG_STATUS_RIPV;
     }
 
@@ -668,8 +672,9 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
                     addr, paddr, "BUS_MCEERR_AR");
             } else {
                  warn_report("Guest MCE Memory Error at QEMU addr %p and "
-                     "GUEST addr 0x%" HWADDR_PRIx " of type %s injected",
-                     addr, paddr, "BUS_MCEERR_AO");
+                     "GUEST addr 0x%" HWADDR_PRIx " of type %s %s",
+                     addr, paddr, "BUS_MCEERR_AO",
+                     IS_AMD_CPU(env) ? "ignored on AMD guest" : "injected");
             }
 
             return;
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 3/3] i386: Add support for SUCCOR feature
  2023-09-12 21:18 [PATCH v4 0/3] Fix MCE handling on AMD hosts John Allen
  2023-09-12 21:18 ` [PATCH v4 1/3] i386: Fix MCE support for " John Allen
  2023-09-12 21:18 ` [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest John Allen
@ 2023-09-12 21:18 ` John Allen
  2024-02-07 11:21 ` [PATCH v4 0/3] Fix MCE handling on AMD hosts Joao Martins
  3 siblings, 0 replies; 15+ messages in thread
From: John Allen @ 2023-09-12 21:18 UTC (permalink / raw)
  To: qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, william.roche,
	joao.m.martins, pbonzini, richard.henderson, eduardo, John Allen

Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.

Reported-by: William Roche <william.roche@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: John Allen <john.allen@amd.com>
---
v2:
  - Add "succor" feature word.
  - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
---
 target/i386/cpu.c     | 18 +++++++++++++++++-
 target/i386/cpu.h     |  4 ++++
 target/i386/kvm/kvm.c |  2 ++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 00f913b638..d90d3a9489 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1029,6 +1029,22 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
         .tcg_features = TCG_APM_FEATURES,
         .unmigratable_flags = CPUID_APM_INVTSC,
     },
+    [FEAT_8000_0007_EBX] = {
+        .type = CPUID_FEATURE_WORD,
+        .feat_names = {
+            NULL, "succor", NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+            NULL, NULL, NULL, NULL,
+        },
+        .cpuid = { .eax = 0x80000007, .reg = R_EBX, },
+        .tcg_features = 0,
+        .unmigratable_flags = 0,
+    },
     [FEAT_8000_0008_EBX] = {
         .type = CPUID_FEATURE_WORD,
         .feat_names = {
@@ -6554,7 +6570,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         break;
     case 0x80000007:
         *eax = 0;
-        *ebx = 0;
+        *ebx = env->features[FEAT_8000_0007_EBX];
         *ecx = 0;
         *edx = env->features[FEAT_8000_0007_EDX];
         break;
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a6000e93bd..f5afc5e4fd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -598,6 +598,7 @@ typedef enum FeatureWord {
     FEAT_7_1_EAX,       /* CPUID[EAX=7,ECX=1].EAX */
     FEAT_8000_0001_EDX, /* CPUID[8000_0001].EDX */
     FEAT_8000_0001_ECX, /* CPUID[8000_0001].ECX */
+    FEAT_8000_0007_EBX, /* CPUID[8000_0007].EBX */
     FEAT_8000_0007_EDX, /* CPUID[8000_0007].EDX */
     FEAT_8000_0008_EBX, /* CPUID[8000_0008].EBX */
     FEAT_8000_0021_EAX, /* CPUID[8000_0021].EAX */
@@ -942,6 +943,9 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 /* Packets which contain IP payload have LIP values */
 #define CPUID_14_0_ECX_LIP              (1U << 31)
 
+/* RAS Features */
+#define CPUID_8000_0007_EBX_SUCCOR      (1U << 1)
+
 /* CLZERO instruction */
 #define CPUID_8000_0008_EBX_CLZERO      (1U << 0)
 /* Always save/restore FP error pointers */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 7e9fc0cac5..15a642a894 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -477,6 +477,8 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
          */
         cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
         ret |= cpuid_1_edx & CPUID_EXT2_AMD_ALIASES;
+    } else if (function == 0x80000007 && reg == R_EBX) {
+        ret |= CPUID_8000_0007_EBX_SUCCOR;
     } else if (function == KVM_CPUID_FEATURES && reg == R_EAX) {
         /* kvm_pv_unhalt is reported by GET_SUPPORTED_CPUID, but it can't
          * be enabled without the in-kernel irqchip
-- 
2.39.3



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-12 21:18 ` [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest John Allen
@ 2023-09-13  3:22   ` Gupta, Pankaj
  2023-09-18 22:00   ` William Roche
  1 sibling, 0 replies; 15+ messages in thread
From: Gupta, Pankaj @ 2023-09-13  3:22 UTC (permalink / raw)
  To: John Allen, qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, william.roche,
	joao.m.martins, pbonzini, richard.henderson, eduardo

> From: William Roche <william.roche@oracle.com>
> 
> AMD guests can't currently deal with BUS_MCEERR_AO MCE injection
> as it panics the VM kernel. We filter this event and provide a
> warning message.
> 
> Signed-off-by: William Roche <william.roche@oracle.com>
> ---
> v3:
>    - New patch
> v4:
>    - Remove redundant check for AO errors
> ---
>   target/i386/kvm/kvm.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 5fce74aac5..7e9fc0cac5 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -604,6 +604,10 @@ static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int code)
>               mcg_status |= MCG_STATUS_RIPV;
>           }
>       } else {
> +        if (code == BUS_MCEERR_AO) {
> +            /* XXX we don't support BUS_MCEERR_AO injection on AMD yet */
> +            return;
> +        }
>           mcg_status |= MCG_STATUS_EIPV | MCG_STATUS_RIPV;
>       }
>   
> @@ -668,8 +672,9 @@ void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr)
>                       addr, paddr, "BUS_MCEERR_AR");
>               } else {
>                    warn_report("Guest MCE Memory Error at QEMU addr %p and "
> -                     "GUEST addr 0x%" HWADDR_PRIx " of type %s injected",
> -                     addr, paddr, "BUS_MCEERR_AO");
> +                     "GUEST addr 0x%" HWADDR_PRIx " of type %s %s",
> +                     addr, paddr, "BUS_MCEERR_AO",
> +                     IS_AMD_CPU(env) ? "ignored on AMD guest" : "injected");
>               }
>   
>               return;

Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-12 21:18 ` [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest John Allen
  2023-09-13  3:22   ` Gupta, Pankaj
@ 2023-09-18 22:00   ` William Roche
  2023-09-20 11:13     ` Joao Martins
  1 sibling, 1 reply; 15+ messages in thread
From: William Roche @ 2023-09-18 22:00 UTC (permalink / raw)
  To: John Allen, qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, joao.m.martins,
	pbonzini, richard.henderson, eduardo

Hi John,

I'd like to put the emphasis on the fact that ignoring the SRAO error
for a VM is a real problem at least for a specific (rare) case I'm
currently working on: The VM migration.

Context:

- In the case of a poisoned page in the VM address space, the migration
can't read it and will skip this page, considering it as a zero-filled
page. The VM kernel (that handled the vMCE) would have marked its
associated page as poisoned, and if the VM touches the page, the VM
kernel generates the associated MCE because it already knows about the
poisoned page.

- When we ignore the vMCE in the case of a SIGBUS/BUS_MCEERR_AO error
(what this patch does), we entirely rely on the Hypervisor to send an
SRAR error to qemu when the page is touched: The AMD VM kernel will
receive the SIGBUS/BUS_MCEERR_AR and deal with it, thanks to your
changes here.

So it looks like the mechanism works fine... unless the VM has migrated
between the SRAO error and the first time it really touches the poisoned
page to get an SRAR error !  In this case, its new address space
(created on the migration destination) will have a zero-page where we
had a poisoned page, and the AMD VM Kernel (that never dealt with the
SRAO) doesn't know about the poisoned page and will access the page
finding only zeros...  We have a memory corruption !

It is a very rare window, but in order to fix it the most reasonable
course of action would be to make the AMD emulation deal with SRAO
errors, instead of ignoring them.

Do you agree with my analysis ?
Would an AMD platform generate SRAO signal to a process
(SIGBUS/BUS_MCEERR_AO) in case of a real hardware error ?

Thanks,
William.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-18 22:00   ` William Roche
@ 2023-09-20 11:13     ` Joao Martins
  2023-09-21 17:41       ` Yazen Ghannam
  0 siblings, 1 reply; 15+ messages in thread
From: Joao Martins @ 2023-09-20 11:13 UTC (permalink / raw)
  To: William Roche, John Allen, qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, pbonzini,
	richard.henderson, eduardo

On 18/09/2023 23:00, William Roche wrote:
> Hi John,
> 
> I'd like to put the emphasis on the fact that ignoring the SRAO error
> for a VM is a real problem at least for a specific (rare) case I'm
> currently working on: The VM migration.
> 
> Context:
> 
> - In the case of a poisoned page in the VM address space, the migration
> can't read it and will skip this page, considering it as a zero-filled
> page. The VM kernel (that handled the vMCE) would have marked it's
> associated page as poisoned, and if the VM touches the page, the VM
> kernel generates the associated MCE because it already knows about the
> poisoned page.
> 
> - When we ignore the vMCE in the case of a SIGBUS/BUS_MCEERR_AO error
> (what this patch does), we entirely rely on the Hypervisor to send an
> SRAR error to qemu when the page is touched: The AMD VM kernel will
> receive the SIGBUS/BUS_MCEERR_AR and deal with it, thanks to your
> changes here.
> 
> So it looks like the mechanism works fine... unless the VM has migrated
> between the SRAO error and the first time it really touches the poisoned
> page to get an SRAR error !  In this case, its new address space
> (created on the migration destination) will have a zero-page where we
> had a poisoned page, and the AMD VM Kernel (that never dealt with the
> SRAO) doesn't know about the poisoned page and will access the page
> finding only zeros...  We have a memory corruption !
> 
> It is a very rare window, but in order to fix it the most reasonable
> course of action would be to make the AMD emulation deal with SRAO
> errors, instead of ignoring them.
> 
> Do you agree with my analysis ?

Under the case that SRAO aren't handled well in the kernel today[*] for AMD, we
could always add a migration blocker when we hit AO sigbus, in case ignoring
is our only option. But this would be less than ideal compared to
propagating the SRAO into the guest.

[*] Meaning knowing that handling the SRAO would generate a crash in the guest

Perhaps, as an improvement, allow qemu to choose whether to propagate
should this limitation be lifted, via a new -action value that lets it
ignore or propagate, e.g.

 -action mce=none # default on Intel to propagate all MCE events to the guest
 -action mce=ignore-optional # Ignore SRAO

I suppose the second is also useful for ARM64 considering they currently ignore
SRAO events too.

> Would an AMD platform generate SRAO signal to a process
> (SIGBUS/BUS_MCEERR_AO) in case of a real hardware error ?
> 
This would be useful to confirm.

> Thanks,
> William.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-20 11:13     ` Joao Martins
@ 2023-09-21 17:41       ` Yazen Ghannam
  2023-09-22  8:36         ` William Roche
  0 siblings, 1 reply; 15+ messages in thread
From: Yazen Ghannam @ 2023-09-21 17:41 UTC (permalink / raw)
  To: Joao Martins, William Roche, John Allen, qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, pbonzini,
	richard.henderson, eduardo

On 9/20/23 7:13 AM, Joao Martins wrote:
> On 18/09/2023 23:00, William Roche wrote:
>> Hi John,
>>
>> I'd like to put the emphasis on the fact that ignoring the SRAO error
>> for a VM is a real problem at least for a specific (rare) case I'm
>> currently working on: The VM migration.
>>
>> Context:
>>
>> - In the case of a poisoned page in the VM address space, the migration
>> can't read it and will skip this page, considering it as a zero-filled
>> page. The VM kernel (that handled the vMCE) would have marked it's
>> associated page as poisoned, and if the VM touches the page, the VM
>> kernel generates the associated MCE because it already knows about the
>> poisoned page.
>>
>> - When we ignore the vMCE in the case of a SIGBUS/BUS_MCEERR_AO error
>> (what this patch does), we entirely rely on the Hypervisor to send an
>> SRAR error to qemu when the page is touched: The AMD VM kernel will
>> receive the SIGBUS/BUS_MCEERR_AR and deal with it, thanks to your
>> changes here.
>>
>> So it looks like the mechanism works fine... unless the VM has migrated
>> between the SRAO error and the first time it really touches the poisoned
>> page to get an SRAR error !  In this case, its new address space
>> (created on the migration destination) will have a zero-page where we
>> had a poisoned page, and the AMD VM Kernel (that never dealt with the
>> SRAO) doesn't know about the poisoned page and will access the page
>> finding only zeros...  We have a memory corruption !

I don't understand this. Why would the page be zero? Even so, why would
that affect poison?

Also, during page migration, does the data flow through the CPU core?
Sorry for the basic question. I haven't done a lot with virtualization.

Please note that current AMD systems use an internal poison marker on
memory. This cannot be cleared through normal memory operations. The
only exception, I think, is to use the CLZERO instruction. This will
completely wipe a cacheline including metadata like poison, etc.

So the hardware should not (by design) lose track of poisoned data.

>>
>> It is a very rare window, but in order to fix it the most reasonable
>> course of action would be to make the AMD emulation deal with SRAO
>> errors, instead of ignoring them.
>>
>> Do you agree with my analysis ?
> 
> Under the case that SRAO aren't handled well in the kernel today[*] for AMD, we
> could always add a migration blocker when we hit AO sigbus, in case ignoring
> is our only option. But this would be less than ideal to propagating the
> SRAO into the guest.
> 
> [*] Meaning knowing that handling the SRAO would generate a crash in the guest
> 
> Perhaps as an improvement, perhaps allow qemu to choose to propagate should this
> limitation be lifted via a new -action value and allow it to ignore/propagate or
> not e.g.
> 
>  -action mce=none # default on Intel to propagate all MCE events to the guest
>  -action mce=ignore-optional # Ignore SRAO
> 
> I suppose the second is also useful for ARM64 considering they currently ignore
> SRAO events too.
> 
>> Would an AMD platform generate SRAO signal to a process
>> (SIGBUS/BUS_MCEERR_AO) in case of a real hardware error ?
>>
> This would be useful to confirm.
>

There is no SRAO signal on AMD. The closest equivalent may be a
"Deferred" error interrupt. This is an x86 APIC LVT interrupt, and it's
sent when a deferred (uncorrectable non-urgent) error is detected by a
memory controller.

In this case, the CPU will get the interrupt and log the error (in the
host).

An enhancement will be to take the MCA error information collected
during the interrupt and extract useful data. For example, we'll need to
translate the reported address to a system physical address that can be
mapped to a page.

Once we have the page, then we can decide how we want to signal the
process(es). We could get a deferred/AO error in the host, and signal the
guest with an AR. So the guest handling could be the same in both cases.

Would this be okay? Or is it important that the guest can distinguish
between the AO/AR cases? IOW, will guests have their own policies on
when to take action? Or is it more about allowing the guest to handle
the error less urgently?

Thanks,
Yazen


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-21 17:41       ` Yazen Ghannam
@ 2023-09-22  8:36         ` William Roche
  2023-09-22 14:30           ` Yazen Ghannam
  0 siblings, 1 reply; 15+ messages in thread
From: William Roche @ 2023-09-22  8:36 UTC (permalink / raw)
  To: Yazen Ghannam, Joao Martins, John Allen, qemu-devel
  Cc: michael.roth, babu.moger, pbonzini, richard.henderson, eduardo

On 9/21/23 19:41, Yazen Ghannam wrote:
> On 9/20/23 7:13 AM, Joao Martins wrote:
>> On 18/09/2023 23:00, William Roche wrote:
>>> [...]
>>> So it looks like the mechanism works fine... unless the VM has migrated
>>> between the SRAO error and the first time it really touches the poisoned
>>> page to get an SRAR error !  In this case, its new address space
>>> (created on the migration destination) will have a zero-page where we
>>> had a poisoned page, and the AMD VM Kernel (that never dealt with the
>>> SRAO) doesn't know about the poisoned page and will access the page
>>> finding only zeros...  We have a memory corruption !
> 
> I don't understand this. Why would the page be zero? Even so, why would
> that affect poison?

The migration of a VM moves the memory content from a source platform to
a destination. This is mainly the qemu processes reading the data and
replicating it on the destination. The source qemu where a memory page
is poisoned is (will be[*]) able to skip the poisoned pages it knows
about to indicate to the destination machine to populate the associated
page(s) with zeros as there is no "poison destination page" mechanism in
place for this migration transfer.

> 
> Also, during page migration, does the data flow through the CPU core?
> Sorry for the basic question. I haven't done a lot with virtualization.

Yes, in most cases (with the exception of RDMA) the data flows through
the CPU cores because the migration checks whether the area to transfer
has some empty pages.
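
As a purely illustrative sketch of why that is (simplified from qemu's
migration/ram.c; save_zero_page_record() and send_page_data() are
made-up placeholder helpers): every page is scanned with
buffer_is_zero() before being sent, so the CPU really does read its
contents.

    /* Sketch only: migration touches every page while deciding how to
     * send it, which is where a poisoned page would fault. */
    static int save_page_sketch(QEMUFile *f, uint8_t *host, size_t page_size)
    {
        if (buffer_is_zero(host, page_size)) {     /* real qemu helper */
            return save_zero_page_record(f);       /* placeholder */
        }
        return send_page_data(f, host, page_size); /* placeholder */
    }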

> 
> Please note that current AMD systems use an internal poison marker on
> memory. This cannot be cleared through normal memory operations. The
> only exception, I think, is to use the CLZERO instruction. This will
> completely wipe a cacheline including metadata like poison, etc.
> 
> So the hardware should not (by design) loose track of poisoned data.

This would be better, but virtualization migration currently loses
track of this.
This is not a problem for VMs where the kernel took note of the poison
and keeps track of it, because that kernel will handle the poison
locations it knows about, signaling when these poisoned locations are
touched.

> 
>>>
>>> It is a very rare window, but in order to fix it the most reasonable
>>> course of action would be to make the AMD emulation deal with SRAO
>>> errors, instead of ignoring them.
>>>
>>> Do you agree with my analysis ?
>>
>> Under the case that SRAO aren't handled well in the kernel today[*] for AMD, we
>> could always add a migration blocker when we hit AO sigbus, in case ignoring
>> is our only option. But this would be less than ideal to propagating the
>> SRAO into the guest.
>>
>> [*] Meaning knowing that handling the SRAO would generate a crash in the guest
>>
>> Perhaps as an improvement, perhaps allow qemu to choose to propagate should this
>> limitation be lifted via a new -action value and allow it to ignore/propagate or
>> not e.g.
>>
>>   -action mce=none # default on Intel to propagate all MCE events to the guest
>>   -action mce=ignore-optional # Ignore SRAO

Yes we may need to create something like that, but missing SRAO has
technical consequences too.

>>
>> I suppose the second is also useful for ARM64 considering they currently ignore
>> SRAO events too.
>>
>>> Would an AMD platform generate SRAO signal to a process
>>> (SIGBUS/BUS_MCEERR_AO) in case of a real hardware error ?
>>>
>> This would be useful to confirm.
>>
> 
> There is no SRAO signal on AMD. The closest equivalent may be a
> "Deferred" error interrupt. This is an x86 APIC LVT interrupt, and it's
> sent when a deferred (uncorrectable non-urgent) error is detected by a
> memory controller.
> 
> In this case, the CPU will get the interrupt and log the error (in the
> host).
> 
> An enhancement will be to take the MCA error information collected
> during the interrupt and extract useful data. For example, we'll need to
> translate the reported address to a system physical address that can be
> mapped to a page.

This would be great, as it would mean that a kernel running in a VM can
get notified too.

> 
> Once we have the page, then we can decide how we want to signal the
> process(es). We could get a deferred/AO error in the host, and signal the
> guest with an AR. So the guest handling could be the same in both cases. >
> Would this be okay? Or is it important that the guest can distinguish
> between the A0/AR cases?


SIGBUS/BUS_MCEERR_AO and BUS_MCEERR_AR are not interchangeable, it is
important to distinguish them.
AO is an asynchronous signal that is only generated when the process
asked for it -- indicating that an error has been detected in its
address space but the affected memory hasn't been touched yet.
Most processes don't care about that (and don't get notified);
they just continue to run, and if the poisoned area is never touched,
great.
Otherwise a BUS_MCEERR_AR signal is generated when the area is touched,
indicating that the execution thread can't access the location.


> IOW, will guests have their own policies on
> when to take action? Or is it more about allowing the guest to handle
> the error less urgently?

Yes to both questions. Any process can indicate if it is interested to
be "early killed on MCE" or not. See proc(5) man page about
/proc/sys/vm/memory_failure_early_kill, and prctl(2) about
PR_MCE_KILL/PR_MCE_KILL_GET. Such a process could take actions before
it's too late and it would need the poisoned data.
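
For illustration only (a minimal sketch using just the interfaces from
prctl(2) and sigaction(2); error handling omitted), opting in to early
AO notifications looks roughly like this:

    #include <signal.h>
    #include <sys/prctl.h>
    #include <unistd.h>

    static void sigbus_handler(int sig, siginfo_t *si, void *ctx)
    {
        (void)sig; (void)ctx;
        if (si->si_code == BUS_MCEERR_AO) {
            /* si->si_addr is poisoned but not consumed yet: the process
             * may discard/rebuild that data and keep running. */
        } else if (si->si_code == BUS_MCEERR_AR) {
            /* The faulting access cannot complete; don't return into it. */
            _exit(1);
        }
    }

    int main(void)
    {
        struct sigaction sa = { .sa_sigaction = sigbus_handler,
                                .sa_flags = SA_SIGINFO };

        sigaction(SIGBUS, &sa, NULL);
        /* Ask for early (AO) notification for this task, instead of the
         * system-wide default in /proc/sys/vm/memory_failure_early_kill. */
        prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
        /* ... real work ... */
        pause();
        return 0;
    }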

Now if an AMD system doesn't warn a process when a Deferred error
occurs, and only generates SIGBUS/BUS_MCEERR_AR errors when the poison
is touched, it means that its processes don't benefit from an "early
kill" and can't take actions to anticipate a synchronous error.

In such a case, ignoring BUS_MCEERR_AO would just help qemu not to crash
in case of "fake/software/injected" signals. And the case of reading the
entire memory (like a migration) would need to be extra careful with a
more probable SIGBUS/BUS_MCEERR_AR signal, which makes the mechanism
more complicated, but would make more sense for AMD and ARM64 too.
(Note that there are still cases where a BUS_MCEERR_AO capable system
can miss an error that is revealed when reading the entire memory, in
this case we currently crash)


[*] See my patch proposal for:
  "Qemu crashes on VM migration after an handled memory error"

In other words, having the AMD kernel to generate SIGBUS/BUS_MCEERR_AO
signals and making AMD qemu able to relay them to the VM kernel would
make things better for AMD platforms ;)

HTH,
William.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-22  8:36         ` William Roche
@ 2023-09-22 14:30           ` Yazen Ghannam
  2023-09-22 16:18             ` William Roche
  2023-10-13 15:41             ` William Roche
  0 siblings, 2 replies; 15+ messages in thread
From: Yazen Ghannam @ 2023-09-22 14:30 UTC (permalink / raw)
  To: William Roche, Joao Martins, John Allen, qemu-devel
  Cc: yazen.ghannam, michael.roth, babu.moger, pbonzini,
	richard.henderson, eduardo

On 9/22/23 4:36 AM, William Roche wrote:
> On 9/21/23 19:41, Yazen Ghannam wrote:
>> On 9/20/23 7:13 AM, Joao Martins wrote:
>>> On 18/09/2023 23:00, William Roche wrote:
>>>> [...]
>>>> So it looks like the mechanism works fine... unless the VM has migrated
>>>> between the SRAO error and the first time it really touches the poisoned
>>>> page to get an SRAR error !  In this case, its new address space
>>>> (created on the migration destination) will have a zero-page where we
>>>> had a poisoned page, and the AMD VM Kernel (that never dealt with the
>>>> SRAO) doesn't know about the poisoned page and will access the page
>>>> finding only zeros...  We have a memory corruption !
>>
>> I don't understand this. Why would the page be zero? Even so, why would
>> that affect poison?
> 
> The migration of a VM moves the memory content from a source platform to
> a destination. This is mainly the qemu processes reading the data and
> replicating it on the destination. The source qemu where a memory page
> is poisoned is(will be[*]) able to skip the poisoned pages it knows
> about to indicate to the destination machine to populate the associated
> page(s) with zeros as there is no "poison destination page" mechanism in
> place for this migration transfer.
> 
>>
>> Also, during page migration, does the data flow through the CPU core?
>> Sorry for the basic question. I haven't done a lot with virtualization.
> 
> Yes, in most cases (with the exception of RDMA) the data flow through
> the CPU cores because the migration verifies if the area to transfer has
> some empty pages.
>

If the CPU moves the memory, then the data will pass through the core/L1
caches, correct? If so, then this will result in an MCE/poison
consumption/AR event in that core.

So it seems to me that migration will always cause an AR event, and the
gap you describe will not occur. Does this make sense? Sorry if I
misunderstood.

In general, the hardware is designed to detect and mark poison, and to
not let poison escape a system undetected. In the strictest case, the
hardware will perform a system reset if poison is leaving the system. In
a more graceful case, the hardware will continue to pass the poison
marker with the data, so the destination hardware will receive it. In
both cases, the goal is to avoid silent data corruption, and to do so in
the hardware, i.e. without relying on firmware or software management.
The hardware designers are very keen on this point.

BTW, the RDMA case will need further discussion. I *think* this would
fall under the "strictest" case. And likely, CPU-based migration will
also. But I think we can test this and find out. :)

>>
>> Please note that current AMD systems use an internal poison marker on
>> memory. This cannot be cleared through normal memory operations. The
>> only exception, I think, is to use the CLZERO instruction. This will
>> completely wipe a cacheline including metadata like poison, etc.
>>
>> So the hardware should not (by design) loose track of poisoned data.
> 
> This would be better, but virtualization migration currently looses
> track of this.
> Which is not a problem for VMs where the kernel took note of the poison
> and keeps track of it. Because this kernel will handle the poison
> locations it knows about, signaling when these poisoned locations are
> touched.
>

Can you please elaborate on this? I would expect the host kernel to do
all the physical memory management, including poison handling.

Or do you mean in the nested poison case like this?
1) The host detects an "AO/deferred" error.
2) The host can try to recover the memory, if clean, etc.
3) Otherwise, the host passes the error info, with "AO/deferred" severity
to the guest.
4) The guest, in nested fashion, can try to recover the memory, if
clean, etc. Or signal its own processes with the AO SIGBUS.

>>
>>>>
>>>> It is a very rare window, but in order to fix it the most reasonable
>>>> course of action would be to make the AMD emulation deal with SRAO
>>>> errors, instead of ignoring them.
>>>>
>>>> Do you agree with my analysis ?
>>>
>>> Under the case that SRAO aren't handled well in the kernel today[*] for AMD, we
>>> could always add a migration blocker when we hit AO sigbus, in case ignoring
>>> is our only option. But this would be less than ideal to propagating the
>>> SRAO into the guest.
>>>
>>> [*] Meaning knowing that handling the SRAO would generate a crash in the guest
>>>
>>> Perhaps as an improvement, perhaps allow qemu to choose to propagate should this
>>> limitation be lifted via a new -action value and allow it to ignore/propagate or
>>> not e.g.
>>>
>>>   -action mce=none # default on Intel to propagate all MCE events to the guest
>>>   -action mce=ignore-optional # Ignore SRAO
> 
> Yes we may need to create something like that, but missing SRAO has
> technical consequences too.
> 
>>>
>>> I suppose the second is also useful for ARM64 considering they currently ignore
>>> SRAO events too.
>>>
>>>> Would an AMD platform generate SRAO signal to a process
>>>> (SIGBUS/BUS_MCEERR_AO) in case of a real hardware error ?
>>>>
>>> This would be useful to confirm.
>>>
>>
>> There is no SRAO signal on AMD. The closest equivalent may be a
>> "Deferred" error interrupt. This is an x86 APIC LVT interrupt, and it's
>> sent when a deferred (uncorrectable non-urgent) error is detected by a
>> memory controller.
>>
>> In this case, the CPU will get the interrupt and log the error (in the
>> host).
>>
>> An enhancement will be to take the MCA error information collected
>> during the interrupt and extract useful data. For example, we'll need to
>> translate the reported address to a system physical address that can be
>> mapped to a page.
> 
> This would be great, as it would mean that a kernel running in a VM can
> get notified too.
>

Yes, I agree.

>>
>> Once we have the page, then we can decide how we want to signal the
>> process(es). We could get a deferred/AO error in the host, and signal the
>> guest with an AR. So the guest handling could be the same in both cases. >
>> Would this be okay? Or is it important that the guest can distinguish
>> between the A0/AR cases?
> 
> 
> SIGBUS/BUS_MCEERR_AO and BUS_MCEERR_AR are not interchangeable, it is
> important to distinguish them.
> AO is an asynchronous signal that is only generated when the process
> asked for it -- indicating that an error has been detected in its
> address space but hasn't been touched yet.
> Most of the processes don't care about that (and don't get notified),
> they just continue to run, if the poisoned area is not touched, great.
> Otherwise a BUS_MCEERR_AR signal is generated when the area is touched,
> indicating that the execution thread can't access the location.
>

Yes, understood.

> 
>> IOW, will guests have their own policies on
>> when to take action? Or is it more about allowing the guest to handle
>> the error less urgently?
> 
> Yes to both questions. Any process can indicate if it is interested to
> be "early killed on MCE" or not. See proc(5) man page about
> /proc/sys/vm/memory_failure_early_kill, and prctl(2) about
> PR_MCE_KILL/PR_MCE_KILL_GET. Such a process could take actions before
> it's too late and it would need the poisoned data.
>

Yes, agree. I think the "nested" case above would fall under this. Also,
an application, or software stack, with complex memory management could
benefit.

I'm thinking something like a long-running HPC application with multiple
checkpoints or stages. It could choose to ensure its memory space is
clean before starting a stage, or restart from an earlier checkpoint if
some data was bad, etc. In any case, the entire application doesn't need
to be killed if 4kB are bad within its entire 1TB address space, for
example.

> Now if an AMD system doesn't warn a process when a Deferred errors
> occurs, and only generates SIGBUS/BUS_MCEERR_AR errors when the poison
> is touched, it means that its processes don't benefit from an "early
> kill" and can't take actions to anticipate a synchronous error.
> 
> In such case, ignoring BUS_MCEERR_AO would just help qemu not to crash
> in case of "fake/software/injected" signals. And the case of reading the
> entire memory (like a migration) would need to be extra careful with a
> more probable SIGBUS/BUS_MCEERR_AR signal, which makes the mechanism
> more complicated, but would make more sense for AMD and ARM64 too.
> (Note that there are still cases where a BUS_MCEERR_AO capable system
> can miss an error that is revealed when reading the entire memory, in
> this case we currently crash)
> 
> 
> [*] See my patch proposal for:
>  "Qemu crashes on VM migration after an handled memory error"
> 
> In other words, having the AMD kernel to generate SIGBUS/BUS_MCEERR_AO
> signals and making AMD qemu able to relay them to the VM kernel would
> make things better for AMD platforms ;)
>

Yes, I agree. :)

Thanks,
Yazen


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-22 14:30           ` Yazen Ghannam
@ 2023-09-22 16:18             ` William Roche
  2023-10-13 15:41             ` William Roche
  1 sibling, 0 replies; 15+ messages in thread
From: William Roche @ 2023-09-22 16:18 UTC (permalink / raw)
  To: Yazen Ghannam, Joao Martins, John Allen, qemu-devel
  Cc: michael.roth, babu.moger, pbonzini, richard.henderson, eduardo

On 9/22/23 16:30, Yazen Ghannam wrote:
> On 9/22/23 4:36 AM, William Roche wrote:
>> On 9/21/23 19:41, Yazen Ghannam wrote:
>>> [...]
>>> Also, during page migration, does the data flow through the CPU core?
>>> Sorry for the basic question. I haven't done a lot with virtualization.
>>
>> Yes, in most cases (with the exception of RDMA) the data flow through
>> the CPU cores because the migration verifies if the area to transfer has
>> some empty pages.
>>
> 
> If the CPU moves the memory, then the data will pass through the core/L1
> caches, correct? If so, then this will result in a MCE/poison
> consumption/AR event in that core.

That's the entire point of this other patch I was referring to:
  "Qemu crashes on VM migration after an handled memory error"
an example of a direct link:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg990803.html

The idea is to skip the pages we know are poisoned -- so we have a
chance to complete the migration without getting AR events :)
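
For context, a simplified sketch of the bookkeeping that patch builds
on -- qemu already keeps a list of poisoned guest pages in
accel/kvm/kvm-all.c (details trimmed, relies on qemu's internal
headers):

    typedef struct HWPoisonPage {
        ram_addr_t ram_addr;
        QLIST_ENTRY(HWPoisonPage) list;
    } HWPoisonPage;

    static QLIST_HEAD(, HWPoisonPage) hwpoison_page_list =
        QLIST_HEAD_INITIALIZER(hwpoison_page_list);

    void kvm_hwpoison_page_add(ram_addr_t ram_addr)
    {
        HWPoisonPage *page;

        /* Record each poisoned guest page once; the migration fix can
         * then consult this list and skip those pages instead of
         * reading them. */
        QLIST_FOREACH(page, &hwpoison_page_list, list) {
            if (page->ram_addr == ram_addr) {
                return;
            }
        }
        page = g_new(HWPoisonPage, 1);
        page->ram_addr = ram_addr;
        QLIST_INSERT_HEAD(&hwpoison_page_list, page, list);
    }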

> 
> So it seems to me that migration will always cause an AR event, and the
> gap you describe will not occur. Does this make sense? Sorry if I
> misunderstood.
> 
> In general, the hardware is designed to detect and mark poison, and to
> not let poison escape a system undetected. In the strictest case, the
> hardware will perform a system reset if poison is leaving the system. In
> a more graceful case, the hardware will continue to pass the poison
> marker with the data, so the destination hardware will receive it. In
> both cases, the goal is to avoid silent data corruption, and to do so in
> the hardware, i.e. without relying on firmware or software management.
> The hardware designers are very keen on this point.

For the moment virtualization needs *several* enhancements just to deal
with memory errors -- what we are currently trying to fix is a good
example of that !

> 
> BTW, the RDMA case will need further discussion. I *think* this would
> fall under the "strictest" case. And likely, CPU-based migration will
> also. But I think we can test this and find out. :)

The test has been done, and showed that the RDMA migration is failing
when poison exists.
But we are discussing aspects that are probably too far from our main
topic here.

> 
>>>
>>> Please note that current AMD systems use an internal poison marker on
>>> memory. This cannot be cleared through normal memory operations. The
>>> only exception, I think, is to use the CLZERO instruction. This will
>>> completely wipe a cacheline including metadata like poison, etc.
>>>
>>> So the hardware should not (by design) loose track of poisoned data.
>>
>> This would be better, but virtualization migration currently looses
>> track of this.
>> Which is not a problem for VMs where the kernel took note of the poison
>> and keeps track of it. Because this kernel will handle the poison
>> locations it knows about, signaling when these poisoned locations are
>> touched.
>>
> 
> Can you please elaborate on this? I would expect the host kernel to do
> all the physical, including poison, memory management.

Yes, the host kernel does that, and the VM kernel too for its own
address space.

> 
> Or do you mean in the nested poison case like this?
> 1) The host detects an "AO/deferred" error.

The host Kernel is notified by the hardware of an SRAO/deferred error

> 2) The host can try to recover the memory, if clean, etc.

From my understanding, this is an uncorrectable error; in the standard
case the kernel can't "clean" the error, but it keeps track of it and
tries to signal the user of the impacted memory page every time it's
needed.

> 3) Otherwise, the host passes the error info, with "AO/deferred" severity
> to the guest.

Yes, in the case of an impacted guest VM, qemu asked to be informed of
AO events, so the host kernel signals them to qemu. Qemu then relays the
information (creating a virtual MCE event) that the VM kernel receives
and deals with.

> 4) The guest, in nested fashion, can try to recover the memory, if
> clean, etc. Or signal its own processes with the AO SIGBUS.

Here again there is no recovery: The VM kernel does the same thing as
the host kernel: memory management, possible signals, etc...


>>> An enhancement will be to take the MCA error information collected
>>> during the interrupt and extract useful data. For example, we'll need to
>>> translate the reported address to a system physical address that can be
>>> mapped to a page.
>>
>> This would be great, as it would mean that a kernel running in a VM can
>> get notified too.
>>
> 
> Yes, I agree.
> 
>>>
>>> Once we have the page, then we can decide how we want to signal the
>>> process(es). We could get a deferred/AO error in the host, and signal the
>>> guest with an AR. So the guest handling could be the same in both cases. >
>>> Would this be okay? Or is it important that the guest can distinguish
>>> between the A0/AR cases?
>>
>>
>> SIGBUS/BUS_MCEERR_AO and BUS_MCEERR_AR are not interchangeable, it is
>> important to distinguish them.
>> AO is an asynchronous signal that is only generated when the process
>> asked for it -- indicating that an error has been detected in its
>> address space but hasn't been touched yet.
>> Most of the processes don't care about that (and don't get notified),
>> they just continue to run, if the poisoned area is not touched, great.
>> Otherwise a BUS_MCEERR_AR signal is generated when the area is touched,
>> indicating that the execution thread can't access the location.
>>
> 
> Yes, understood.
> 
>>
>>> IOW, will guests have their own policies on
>>> when to take action? Or is it more about allowing the guest to handle
>>> the error less urgently?
>>
>> Yes to both questions. Any process can indicate if it is interested to
>> be "early killed on MCE" or not. See proc(5) man page about
>> /proc/sys/vm/memory_failure_early_kill, and prctl(2) about
>> PR_MCE_KILL/PR_MCE_KILL_GET. Such a process could take actions before
>> it's too late and it would need the poisoned data.
>>
> 
> Yes, agree. I think the "nested" case above would fall under this. Also,
> an application, or software stack, with complex memory management could
> benefit.

Sure -- some databases already take advantage of this mechanism for
example too ;)

>> In other words, having the AMD kernel to generate SIGBUS/BUS_MCEERR_AO
>> signals and making AMD qemu able to relay them to the VM kernel would
>> make things better for AMD platforms ;)
>>
> 
> Yes, I agree. :)

So in my opinion, for the moment we should integrate the 3 proposed
patches, and continue to work to make:
  - the AMD kernel deal better with SRAO both on the host
    and the VM sides,
  - in combination with another qemu enhancement to relay the
    BUS_MCEERR_AO signal so that the VM kernel deals with it too.

The reason why I started this conversation was to know if there would be
a simple way to already informed the VM kernel of an AO signal (without
crashing it) even if it is not yet able to relay the event to its own
processes. But this would prepare qemu so that when the kernel is
enhanced, it may not be necessary to modify qemu again.

The patches we are currently focusing on (Fix MCE handling on AMD hosts)
help to better deal with BUS_MCEERR_AR signal instead of crashing --
this looks like a necessary step to me.

HTH,
William.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
  2023-09-22 14:30           ` Yazen Ghannam
  2023-09-22 16:18             ` William Roche
@ 2023-10-13 15:41             ` William Roche
  1 sibling, 0 replies; 15+ messages in thread
From: William Roche @ 2023-10-13 15:41 UTC (permalink / raw)
  To: qemu-devel, John Allen, Yazen Ghannam, Joao Martins
  Cc: michael.roth, babu.moger, pbonzini, richard.henderson, eduardo

Just a note to inform you that I've submitted a new patch on a
separate thread -- dealing with VM live migration after receiving
memory errors:
https://lore.kernel.org/qemu-devel/20231013150839.867164-3-william.roche@oracle.com/

This patch belongs to a 2 patches set that should fix the migration in
case of memory errors received and handled by the VM before the
migration request.

For the moment this other patch only fixes the ARM case ignoring
SIGBUS/BUS_MCEERR_AO errors, but the same mechanism should be used with
AMD ignoring SIGBUS/BUS_MCEERR_AO too, using the same new parameter
to the kvm_hwpoison_page_add function in kvm_arch_on_sigbus_vcpu:

     kvm_hwpoison_page_add(ram_addr, (code == BUS_MCEERR_AR));

Of course we'll have to wait for this above patch to be integrated first.

HTH,
William.


On 9/19/23 00:00, William Roche wrote:
 > Hi John,
 >
 > I'd like to put the emphasis on the fact that ignoring the SRAO error
 > for a VM is a real problem at least for a specific (rare) case I'm
 > currently working on: The VM migration.
 >
 > Context:
 >
 > - In the case of a poisoned page in the VM address space, the migration
 > can't read it and will skip this page, considering it as a zero-filled
 > page. The VM kernel (that handled the vMCE) would have marked it's
 > associated page as poisoned, and if the VM touches the page, the VM
 > kernel generates the associated MCE because it already knows about the
 > poisoned page.
 >
 > - When we ignore the vMCE in the case of a SIGBUS/BUS_MCEERR_AO error
 > (what this patch does), we entirely rely on the Hypervisor to send an
 > SRAR error to qemu when the page is touched: The AMD VM kernel will
 > receive the SIGBUS/BUS_MCEERR_AR and deal with it, thanks to your
 > changes here.
 >
 > So it looks like the mechanism works fine... unless the VM has migrated
 > between the SRAO error and the first time it really touches the poisoned
 > page to get an SRAR error !  In this case, its new address space
 > (created on the migration destination) will have a zero-page where we
 > had a poisoned page, and the AMD VM Kernel (that never dealt with the
 > SRAO) doesn't know about the poisoned page and will access the page
 > finding only zeros...  We have a memory corruption !
 >
 > It is a very rare window, but in order to fix it the most reasonable
 > course of action would be to make the AMD emulation deal with SRAO
 > errors, instead of ignoring them.
 >
 > Do you agree with my analysis ?
 > Would an AMD platform generate SRAO signal to a process
 > (SIGBUS/BUS_MCEERR_AO) in case of a real hardware error ?
 >
 > Thanks,
 > William.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/3] Fix MCE handling on AMD hosts
  2023-09-12 21:18 [PATCH v4 0/3] Fix MCE handling on AMD hosts John Allen
                   ` (2 preceding siblings ...)
  2023-09-12 21:18 ` [PATCH v4 3/3] i386: Add support for SUCCOR feature John Allen
@ 2024-02-07 11:21 ` Joao Martins
  2024-02-20 17:27   ` John Allen
  3 siblings, 1 reply; 15+ messages in thread
From: Joao Martins @ 2024-02-07 11:21 UTC (permalink / raw)
  To: John Allen, pbonzini, william.roche
  Cc: yazen.ghannam, michael.roth, babu.moger, richard.henderson,
	eduardo, qemu-devel, peterx

On 12/09/2023 22:18, John Allen wrote:
> In the event that a guest process attempts to access memory that has
> been poisoned in response to a deferred uncorrected MCE, an AMD system
> will currently generate a SIGBUS error which will result in the entire
> guest being shutdown. Ideally, we only want to kill the guest process
> that accessed poisoned memory in this case.
> 
> This support has been included in qemu for Intel hosts for a long time,
> but there are a couple of changes needed for AMD hosts. First, we will
> need to expose the SUCCOR cpuid bit to guests. Second, we need to modify
> the MCE injection code to avoid Intel specific behavior when we are
> running on an AMD host.
> 

Is there any update with respect to this series?

John's series should fix MCE injection on AMD; as today it is just crashing the
guest (sadly) when an MCE happens in the hypervisor.

William, Paolo, I think the sort-of-dependency(?) of this where we block
migration if there was a poisoned page on is already in Peter's migration
tree[1] (CC'ed). So perhaps this series just needs John to resend it given that
it's been a couple months since v4?

[1]
https://lore.kernel.org/qemu-devel/20240130190640.139364-2-william.roche@oracle.com/

> v2:
>   - Add "succor" feature word.
>   - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
> 
> v3:
>   - Reorder series. Only enable SUCCOR after bugs have been fixed.
>   - Introduce new patch ignoring AO errors.
> 
> v4:
>   - Remove redundant check for AO errors.
> 
> John Allen (2):
>   i386: Fix MCE support for AMD hosts
>   i386: Add support for SUCCOR feature
> 
> William Roche (1):
>   i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
> 
>  target/i386/cpu.c     | 18 +++++++++++++++++-
>  target/i386/cpu.h     |  4 ++++
>  target/i386/helper.c  |  4 ++++
>  target/i386/kvm/kvm.c | 28 ++++++++++++++++++++--------
>  4 files changed, 45 insertions(+), 9 deletions(-)
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/3] Fix MCE handling on AMD hosts
  2024-02-07 11:21 ` [PATCH v4 0/3] Fix MCE handling on AMD hosts Joao Martins
@ 2024-02-20 17:27   ` John Allen
  2024-02-21 11:42     ` Joao Martins
  0 siblings, 1 reply; 15+ messages in thread
From: John Allen @ 2024-02-20 17:27 UTC (permalink / raw)
  To: Joao Martins
  Cc: pbonzini, william.roche, yazen.ghannam, michael.roth, babu.moger,
	richard.henderson, eduardo, qemu-devel, peterx

On Wed, Feb 07, 2024 at 11:21:05AM +0000, Joao Martins wrote:
> On 12/09/2023 22:18, John Allen wrote:
> > In the event that a guest process attempts to access memory that has
> > been poisoned in response to a deferred uncorrected MCE, an AMD system
> > will currently generate a SIGBUS error which will result in the entire
> > guest being shutdown. Ideally, we only want to kill the guest process
> > that accessed poisoned memory in this case.
> > 
> > This support has been included in qemu for Intel hosts for a long time,
> > but there are a couple of changes needed for AMD hosts. First, we will
> > need to expose the SUCCOR cpuid bit to guests. Second, we need to modify
> > the MCE injection code to avoid Intel specific behavior when we are
> > running on an AMD host.
> > 
> 
> Is there any update with respect to this series?
> 
> John's series should fix MCE injection on AMD; as today it is just crashing the
> guest (sadly) when an MCE happens in the hypervisor.
> 
> William, Paolo, I think the sort-of-dependency(?) of this where we block
> migration if there was a poisoned page on is already in Peter's migration
> tree[1] (CC'ed). So perhaps this series just needs John to resend it given that
> it's been a couple months since v4?

It looks like this series still applies cleanly to latest qemu, but I
can resend if needed.

Thanks,
John

> 
> [1]
> https://lore.kernel.org/qemu-devel/20240130190640.139364-2-william.roche@oracle.com/
> 
> > v2:
> >   - Add "succor" feature word.
> >   - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
> > 
> > v3:
> >   - Reorder series. Only enable SUCCOR after bugs have been fixed.
> >   - Introduce new patch ignoring AO errors.
> > 
> > v4:
> >   - Remove redundant check for AO errors.
> > 
> > John Allen (2):
> >   i386: Fix MCE support for AMD hosts
> >   i386: Add support for SUCCOR feature
> > 
> > William Roche (1):
> >   i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
> > 
> >  target/i386/cpu.c     | 18 +++++++++++++++++-
> >  target/i386/cpu.h     |  4 ++++
> >  target/i386/helper.c  |  4 ++++
> >  target/i386/kvm/kvm.c | 28 ++++++++++++++++++++--------
> >  4 files changed, 45 insertions(+), 9 deletions(-)
> > 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/3] Fix MCE handling on AMD hosts
  2024-02-20 17:27   ` John Allen
@ 2024-02-21 11:42     ` Joao Martins
  0 siblings, 0 replies; 15+ messages in thread
From: Joao Martins @ 2024-02-21 11:42 UTC (permalink / raw)
  To: John Allen, pbonzini
  Cc: william.roche, yazen.ghannam, michael.roth, babu.moger,
	richard.henderson, eduardo, qemu-devel, peterx

On 20/02/2024 17:27, John Allen wrote:
> On Wed, Feb 07, 2024 at 11:21:05AM +0000, Joao Martins wrote:
>> On 12/09/2023 22:18, John Allen wrote:
>>> In the event that a guest process attempts to access memory that has
>>> been poisoned in response to a deferred uncorrected MCE, an AMD system
>>> will currently generate a SIGBUS error which will result in the entire
>>> guest being shutdown. Ideally, we only want to kill the guest process
>>> that accessed poisoned memory in this case.
>>>
>>> This support has been included in qemu for Intel hosts for a long time,
>>> but there are a couple of changes needed for AMD hosts. First, we will
>>> need to expose the SUCCOR cpuid bit to guests. Second, we need to modify
>>> the MCE injection code to avoid Intel specific behavior when we are
>>> running on an AMD host.
>>>
>>
>> Is there any update with respect to this series?
>>
>> John's series should fix MCE injection on AMD; as today it is just crashing the
>> guest (sadly) when an MCE happens in the hypervisor.
>>
>> William, Paolo, I think the sort-of-dependency(?) of this where we block
>> migration if there was a poisoned page on is already in Peter's migration
>> tree[1] (CC'ed). So perhaps this series just needs John to resend it given that
>> it's been a couple months since v4?
> 
> It looks like this series still applies cleanly to latest qemu, but I
> can resend if needed.
> 
That's great I suppose.

I was hoping Paolo would respond, to understand next steps.

There's also the other kernel patch that Paolo suggested[0], to declare the
SUCCOR bit in the kvm supported CPUID? Maybe it's being held up because of that?

[0]
https://lore.kernel.org/qemu-devel/d4c1bb9b-8438-ed00-c79d-e8ad2a7e4eed@redhat.com/

> Thanks,
> John
> 
>>
>> [1]
>> https://lore.kernel.org/qemu-devel/20240130190640.139364-2-william.roche@oracle.com/
>>
>>> v2:
>>>   - Add "succor" feature word.
>>>   - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
>>>
>>> v3:
>>>   - Reorder series. Only enable SUCCOR after bugs have been fixed.
>>>   - Introduce new patch ignoring AO errors.
>>>
>>> v4:
>>>   - Remove redundant check for AO errors.
>>>
>>> John Allen (2):
>>>   i386: Fix MCE support for AMD hosts
>>>   i386: Add support for SUCCOR feature
>>>
>>> William Roche (1):
>>>   i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest
>>>
>>>  target/i386/cpu.c     | 18 +++++++++++++++++-
>>>  target/i386/cpu.h     |  4 ++++
>>>  target/i386/helper.c  |  4 ++++
>>>  target/i386/kvm/kvm.c | 28 ++++++++++++++++++++--------
>>>  4 files changed, 45 insertions(+), 9 deletions(-)
>>>
>>



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-02-21 15:26 UTC | newest]

Thread overview: 15+ messages
2023-09-12 21:18 [PATCH v4 0/3] Fix MCE handling on AMD hosts John Allen
2023-09-12 21:18 ` [PATCH v4 1/3] i386: Fix MCE support for " John Allen
2023-09-12 21:18 ` [PATCH v4 2/3] i386: Explicitly ignore unsupported BUS_MCEERR_AO MCE on AMD guest John Allen
2023-09-13  3:22   ` Gupta, Pankaj
2023-09-18 22:00   ` William Roche
2023-09-20 11:13     ` Joao Martins
2023-09-21 17:41       ` Yazen Ghannam
2023-09-22  8:36         ` William Roche
2023-09-22 14:30           ` Yazen Ghannam
2023-09-22 16:18             ` William Roche
2023-10-13 15:41             ` William Roche
2023-09-12 21:18 ` [PATCH v4 3/3] i386: Add support for SUCCOR feature John Allen
2024-02-07 11:21 ` [PATCH v4 0/3] Fix MCE handling on AMD hosts Joao Martins
2024-02-20 17:27   ` John Allen
2024-02-21 11:42     ` Joao Martins
