From mboxrd@z Thu Jan  1 00:00:00 1970
From: Huang Ying
Subject: Re: [PATCH -v4] QEMU-KVM: MCE: Relay UCR MCE to guest
Date: Tue, 22 Sep 2009 09:12:54 +0800
Message-ID: <1253581974.15717.726.camel@yhuang-dev.sh.intel.com>
References: <1253501005.15717.548.camel@yhuang-dev.sh.intel.com> <4AB750A6.1090000@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Cc: Marcelo Tosatti, Andi Kleen, Anthony Liguori, "kvm@vger.kernel.org"
To: Avi Kivity
Return-path:
Received: from mga09.intel.com ([134.134.136.24]:44619 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750783AbZIVBMw (ORCPT ); Mon, 21 Sep 2009 21:12:52 -0400
In-Reply-To: <4AB750A6.1090000@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Mon, 2009-09-21 at 18:08 +0800, Avi Kivity wrote:
> On 09/21/2009 05:43 AM, Huang Ying wrote:
> > UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
> > where some hardware error such as some memory error can be reported
> > without PCC (processor context corrupted). To recover from such MCE,
> > the corresponding memory will be unmapped, and all processes accessing
> > the memory will be killed via SIGBUS.
> >
> > For KVM, if QEMU/KVM is killed, all guest processes will be killed
> > too. So we relay SIGBUS from host OS to guest system via a UCR MCE
> > injection. Then guest OS can isolate corresponding memory and kill
> > necessary guest processes only. SIGBUS sent to main thread (not VCPU
> > threads) will be broadcast to all VCPU threads as UCR MCE.
> >
> >
> >
> > --- a/qemu-kvm.c
> > +++ b/qemu-kvm.c
> > @@ -27,10 +27,23 @@
> >   #include
> >   #include
> >   #include
> > +#include
> >
>
> This causes a build failure, since not all hosts have ,
> but more importantly:

Maybe we can just add the necessary fields to struct
qemu_signalfd_siginfo, but that may not be portable.

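To make the "add the necessary fields" idea concrete, here is a minimal sketch, not taken from the patch. It assumes a hypothetical struct sketch_siginfo (this is not QEMU's struct qemu_signalfd_siginfo, whose real layout is not shown in this thread) and a hypothetical helper sketch_siginfo_from_posix(). It carries only the fields an MCE relay would plausibly need and fills them from a plain POSIX siginfo_t, so it does not depend on a Linux-only header:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record, for illustration only: not QEMU's actual struct. */
struct sketch_siginfo {
    uint32_t ssi_signo;  /* signal number */
    int32_t  ssi_code;   /* si_code, e.g. BUS_ADRERR */
    uint64_t ssi_addr;   /* faulting address, 0 when not applicable */
};

/* Copy the fields of interest out of a POSIX siginfo_t. */
static void sketch_siginfo_from_posix(struct sketch_siginfo *dst,
                                      const siginfo_t *src)
{
    assert(dst != NULL && src != NULL);

    memset(dst, 0, sizeof(*dst));
    dst->ssi_signo = (uint32_t)src->si_signo;
    dst->ssi_code = src->si_code;

    /* si_addr is only meaningful for fault-type signals. */
    switch (src->si_signo) {
    case SIGBUS:
    case SIGSEGV:
    case SIGILL:
    case SIGFPE:
        dst->ssi_addr = (uint64_t)(uintptr_t)src->si_addr;
        break;
    default:
        break;
    }
}
```

Whether si_code and si_addr alone are enough to build the guest MCE is an assumption here; the portability concern above still applies to any host without siginfo_t fault fields.
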
> > +
> > +static void sigbus_handler(int n, struct signalfd_siginfo *siginfo, void *ctx)
> > +{
> >
>
> Here you accept signalfd_siginfo, while
>
> > +
> > +    memset(&action, 0, sizeof(action));
> > +    action.sa_flags = SA_SIGINFO;
> > +    action.sa_sigaction = (void (*)(int, siginfo_t*, void*))sigbus_handler;
> > +    sigaction(SIGBUS, &action, NULL);
> > +    prctl(PR_MCE_KILL, 1, 1);
> >       return 0;
> >
>
> here you arm the function with something that will send it a siginfo_t.
> So it looks like this is broken if a signal is ever received directly?
> But can this happen due to signalfd?

Because SIGBUS is blocked, I think the signal handler will not be called
directly, only from sigfd_handler.

> >   }
> >
> > @@ -1962,7 +2116,10 @@ static void sigfd_handler(void *opaque)
> >           }
> >
> >           sigaction(info.ssi_signo, NULL, &action);
> > -        if (action.sa_handler)
> > +        if ((action.sa_flags & SA_SIGINFO) && action.sa_sigaction)
> > +            action.sa_sigaction(info.ssi_signo,
> > +                                (siginfo_t *)&info, NULL);
> > +        else if (action.sa_handler)
> >               action.sa_handler(info.ssi_signo);
> >
>
> The whole "extract handler from sigaction and call it" was a hack.

The "hack" above (signalfd_siginfo vs siginfo_t) exists because of
"extract handler from sigaction and call it" too. So I suggest replacing
it with a direct call to the handler.

Best Regards,
Huang Ying
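
---
A rough sketch of what "calling the handler directly" could look like, not the actual follow-up patch. The names on_sigbus(), open_sigbus_fd(), dispatch_one_signal() and last_sigbus are hypothetical. The handler takes a struct signalfd_siginfo, so no sigaction() lookup and no cast to a siginfo_t handler type is needed. It uses sigprocmask() for brevity; a threaded program such as QEMU would presumably use pthread_sigmask() instead.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Hypothetical record of the last SIGBUS seen, for illustration only. */
static struct {
    int      seen;
    int32_t  code;
    uint64_t addr;
} last_sigbus;

/* Typed for the signalfd record, so no siginfo_t cast is involved. */
static void on_sigbus(const struct signalfd_siginfo *info)
{
    assert(info != NULL);
    last_sigbus.seen = 1;
    last_sigbus.code = info->ssi_code;
    last_sigbus.addr = info->ssi_addr;
}

/* Block SIGBUS in the calling thread and return a signalfd for it, or -1. */
static int open_sigbus_fd(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGBUS);
    if (sigprocmask(SIG_BLOCK, &set, NULL) < 0)
        return -1;
    return signalfd(-1, &set, SFD_CLOEXEC);
}

/* Read one pending signal record and call its handler directly.
 * Returns the signal number, or -1 on a short or failed read. */
static int dispatch_one_signal(int fd)
{
    struct signalfd_siginfo info;
    ssize_t len = read(fd, &info, sizeof(info));

    if (len != (ssize_t)sizeof(info))
        return -1;

    switch (info.ssi_signo) {
    case SIGBUS:
        on_sigbus(&info);
        break;
    default:
        /* Other signals would get their own typed handlers here. */
        break;
    }
    return (int)info.ssi_signo;
}
```

Since SIGBUS stays blocked, the record is only ever delivered through the read() in dispatch_one_signal(), which matches the reasoning above that the handler is never invoked as a real signal handler.
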