From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH -v4] QEMU-KVM: MCE: Relay UCR MCE to guest
Date: Mon, 21 Sep 2009 13:08:38 +0300
Message-ID: <4AB750A6.1090000@redhat.com>
References: <1253501005.15717.548.camel@yhuang-dev.sh.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Marcelo Tosatti, Andi Kleen, Anthony Liguori, "kvm@vger.kernel.org"
To: Huang Ying
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:45900 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755891AbZIUKIq (ORCPT ); Mon, 21 Sep 2009 06:08:46 -0400
In-Reply-To: <1253501005.15717.548.camel@yhuang-dev.sh.intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 09/21/2009 05:43 AM, Huang Ying wrote:
> UCR (uncorrected recovery) MCE is supported in recent Intel CPUs,
> where some hardware error such as some memory error can be reported
> without PCC (processor context corrupted). To recover from such MCE,
> the corresponding memory will be unmapped, and all processes accessing
> the memory will be killed via SIGBUS.
>
> For KVM, if QEMU/KVM is killed, all guest processes will be killed
> too. So we relay SIGBUS from host OS to guest system via a UCR MCE
> injection. Then guest OS can isolate corresponding memory and kill
> necessary guest processes only. SIGBUS sent to main thread (not VCPU
> threads) will be broadcast to all VCPU threads as UCR MCE.
>
> --- a/qemu-kvm.c
> +++ b/qemu-kvm.c
> @@ -27,10 +27,23 @@
>  #include
>  #include
>  #include
> +#include
>

This causes a build failure, since not all hosts have this header, but
more importantly:

> +
> +static void sigbus_handler(int n, struct signalfd_siginfo *siginfo, void *ctx)
> +{
>

Here you accept signalfd_siginfo, while

> +
> +    memset(&action, 0, sizeof(action));
> +    action.sa_flags = SA_SIGINFO;
> +    action.sa_sigaction = (void (*)(int, siginfo_t*, void*))sigbus_handler;
> +    sigaction(SIGBUS, &action, NULL);
> +    prctl(PR_MCE_KILL, 1, 1);
>      return 0;
>

here you arm the function with something that will send it a siginfo_t.
So it looks like this is broken if a signal is ever received directly?
But can this happen due to signalfd?

>  }
>
> @@ -1962,7 +2116,10 @@ static void sigfd_handler(void *opaque)
>      }
>
>      sigaction(info.ssi_signo, NULL, &action);
> -    if (action.sa_handler)
> +    if ((action.sa_flags & SA_SIGINFO) && action.sa_sigaction)
> +        action.sa_sigaction(info.ssi_signo,
> +                            (siginfo_t *)&info, NULL);
> +    else if (action.sa_handler)
>          action.sa_handler(info.ssi_signo);
>

The whole "extract handler from sigaction and call it" was a hack.

-- 
error compiling committee.c: too many arguments to function
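
To illustrate the type mismatch raised above: struct signalfd_siginfo and
siginfo_t are distinct types, so a handler that can be reached both through
signalfd and through direct signal delivery cannot treat one as the other
via a cast. A minimal sketch of an explicit conversion follows, assuming a
Linux host with <sys/signalfd.h>. The helper name sfd_to_siginfo is
hypothetical and is not part of the patch under review; it copies only the
fields a SIGBUS handler would typically need.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/signalfd.h>

/*
 * Hypothetical helper (not from the patch): build a real siginfo_t from
 * the record read out of a signalfd, field by field, instead of casting
 * a struct signalfd_siginfo pointer to siginfo_t *.
 */
static void sfd_to_siginfo(const struct signalfd_siginfo *sfd, siginfo_t *si)
{
    memset(si, 0, sizeof(*si));
    si->si_signo = (int)sfd->ssi_signo;   /* signal number */
    si->si_errno = sfd->ssi_errno;        /* errno value, usually 0 */
    si->si_code  = sfd->ssi_code;         /* e.g. a BUS_* code for SIGBUS */
    /* Faulting address; ssi_addr is a 64-bit integer in the signalfd record. */
    si->si_addr  = (void *)(uintptr_t)sfd->ssi_addr;
}
```

With such a conversion, the handler itself can take a plain siginfo_t * in
both paths, and the signalfd dispatch code passes it the converted copy.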