From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jan Kiszka Subject: Re: [PATCH 14/22] kvm: Fix race between timer signals and vcpu entry under !IOTHREAD Date: Tue, 01 Feb 2011 15:37:26 +0100 Message-ID: <4D481AA6.9010607@siemens.com> References: <20110201124707.GA12061@amt.cnet> <4D480B76.90509@siemens.com> <20110201134821.GA12848@amt.cnet> <4D48116A.9020807@siemens.com> <20110201141039.GA14442@amt.cnet> <4D4816F0.5060009@siemens.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Avi Kivity , "kvm@vger.kernel.org" , "qemu-devel@nongnu.org" , Stefan Hajnoczi To: Marcelo Tosatti Return-path: Received: from thoth.sbs.de ([192.35.17.2]:28438 "EHLO thoth.sbs.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753267Ab1BAOho (ORCPT ); Tue, 1 Feb 2011 09:37:44 -0500 In-Reply-To: <4D4816F0.5060009@siemens.com> Sender: kvm-owner@vger.kernel.org List-ID: On 2011-02-01 15:21, Jan Kiszka wrote: > On 2011-02-01 15:10, Marcelo Tosatti wrote: >> On Tue, Feb 01, 2011 at 02:58:02PM +0100, Jan Kiszka wrote: >>> On 2011-02-01 14:48, Marcelo Tosatti wrote: >>>> On Tue, Feb 01, 2011 at 02:32:38PM +0100, Jan Kiszka wrote: >>>>> On 2011-02-01 13:47, Marcelo Tosatti wrote: >>>>>> On Thu, Jan 27, 2011 at 02:09:58PM +0100, Jan Kiszka wrote: >>>>>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between >>>>>>> checking for exit_request on vcpu entry and timer signals arriving >>>>>>> before KVM starts to catch them. Plug it by blocking both timer related >>>>>>> signals also on !CONFIG_IOTHREAD and process those via signalfd. >>>>>>> >>>>>>> Signed-off-by: Jan Kiszka >>>>>>> CC: Stefan Hajnoczi >>>>>>> --- >>>>>>> cpus.c | 18 ++++++++++++++++++ >>>>>>> 1 files changed, 18 insertions(+), 0 deletions(-) >>>>>>> >>>>>>> diff --git a/cpus.c b/cpus.c >>>>>>> index fc3f222..29b1070 100644 >>>>>>> --- a/cpus.c >>>>>>> +++ b/cpus.c >>>>>>> @@ -254,6 +254,10 @@ static void qemu_kvm_init_cpu_signals(CPUState *env) >>>>>>> pthread_sigmask(SIG_BLOCK, NULL, &set); >>>>>>> sigdelset(&set, SIG_IPI); >>>>>>> sigdelset(&set, SIGBUS); >>>>>>> +#ifndef CONFIG_IOTHREAD >>>>>>> + sigdelset(&set, SIGIO); >>>>>>> + sigdelset(&set, SIGALRM); >>>>>>> +#endif >>>>>> >>>>>> I'd prefer separate qemu_kvm_init_cpu_signals in the !IOTHREAD >>>>>> section. >>>>> >>>>> You mean to duplicate qemu_kvm_init_cpu_signals for both configurations? >>>> >>>> Yes, so to avoid #ifdefs spread. >>> >>> Would exchange some #ifdefs against ifndef _WIN32. Haven't measured the >>> delta though. >>> >>>> >>>>>>> + >>>>>>> +#ifndef CONFIG_IOTHREAD >>>>>>> + if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) { >>>>>>> + qemu_notify_event(); >>>>>>> + } >>>>>>> +#endif >>>>>> >>>>>> Why is this necessary? >>>>>> >>>>>> You should break out of cpu_exec_all if there's a pending alarm (see >>>>>> qemu_alarm_pending()). >>>>> >>>>> qemu_alarm_pending() is not true until the signal is actually taken. The >>>>> alarm handler sets the required flags. >>>> >>>> Right. What i mean is you need to execute the signal handler inside >>>> cpu_exec_all loop (so that alarm pending is set). >>>> >>>> So, if there is a SIGALRM pending, qemu_run_timers has highest >>>> priority, not vcpu execution. >>> >>> We leave the vcpu loop (thanks to notify_event), process the signal in >>> the event loop and run the timer handler. This pattern is IMO less >>> invasive to the existing code, specifically as it is about to die >>> long-term anyway. >> >> You'll probably see poor timer behaviour on smp guests without iothread >> enabled. >> > > Still checking, but that would mean the notification mechanism is broken > anyway: If IO events do not force us to process them quickly, we already > suffer from latencies in SMP mode. Maybe a regression caused by the iothread introduction: we need to break out of the cpu loop via global exit_request when there is an IO event pending. Fixing this will also heal the above issue. Sigh, we need to get rid of those two implementations and focus all reviewing and testing on one. I bet there are still more corner cases sleeping somewhere. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=57738 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PkHMN-0001JW-1Y for qemu-devel@nongnu.org; Tue, 01 Feb 2011 09:37:37 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PkHMK-00007D-PN for qemu-devel@nongnu.org; Tue, 01 Feb 2011 09:37:34 -0500 Received: from thoth.sbs.de ([192.35.17.2]:25932) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PkHMK-00006g-Gs for qemu-devel@nongnu.org; Tue, 01 Feb 2011 09:37:32 -0500 Message-ID: <4D481AA6.9010607@siemens.com> Date: Tue, 01 Feb 2011 15:37:26 +0100 From: Jan Kiszka MIME-Version: 1.0 References: <20110201124707.GA12061@amt.cnet> <4D480B76.90509@siemens.com> <20110201134821.GA12848@amt.cnet> <4D48116A.9020807@siemens.com> <20110201141039.GA14442@amt.cnet> <4D4816F0.5060009@siemens.com> In-Reply-To: <4D4816F0.5060009@siemens.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: [Qemu-devel] Re: [PATCH 14/22] kvm: Fix race between timer signals and vcpu entry under !IOTHREAD List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Marcelo Tosatti Cc: Stefan Hajnoczi , Avi Kivity , "kvm@vger.kernel.org" , "qemu-devel@nongnu.org" On 2011-02-01 15:21, Jan Kiszka wrote: > On 2011-02-01 15:10, Marcelo Tosatti wrote: >> On Tue, Feb 01, 2011 at 02:58:02PM +0100, Jan Kiszka wrote: >>> On 2011-02-01 14:48, Marcelo Tosatti wrote: >>>> On Tue, Feb 01, 2011 at 02:32:38PM +0100, Jan Kiszka wrote: >>>>> On 2011-02-01 13:47, Marcelo Tosatti wrote: >>>>>> On Thu, Jan 27, 2011 at 02:09:58PM +0100, Jan Kiszka wrote: >>>>>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between >>>>>>> checking for exit_request on vcpu entry and timer signals arriving >>>>>>> before KVM starts to catch them. Plug it by blocking both timer related >>>>>>> signals also on !CONFIG_IOTHREAD and process those via signalfd. >>>>>>> >>>>>>> Signed-off-by: Jan Kiszka >>>>>>> CC: Stefan Hajnoczi >>>>>>> --- >>>>>>> cpus.c | 18 ++++++++++++++++++ >>>>>>> 1 files changed, 18 insertions(+), 0 deletions(-) >>>>>>> >>>>>>> diff --git a/cpus.c b/cpus.c >>>>>>> index fc3f222..29b1070 100644 >>>>>>> --- a/cpus.c >>>>>>> +++ b/cpus.c >>>>>>> @@ -254,6 +254,10 @@ static void qemu_kvm_init_cpu_signals(CPUState *env) >>>>>>> pthread_sigmask(SIG_BLOCK, NULL, &set); >>>>>>> sigdelset(&set, SIG_IPI); >>>>>>> sigdelset(&set, SIGBUS); >>>>>>> +#ifndef CONFIG_IOTHREAD >>>>>>> + sigdelset(&set, SIGIO); >>>>>>> + sigdelset(&set, SIGALRM); >>>>>>> +#endif >>>>>> >>>>>> I'd prefer separate qemu_kvm_init_cpu_signals in the !IOTHREAD >>>>>> section. >>>>> >>>>> You mean to duplicate qemu_kvm_init_cpu_signals for both configurations? >>>> >>>> Yes, so to avoid #ifdefs spread. >>> >>> Would exchange some #ifdefs against ifndef _WIN32. Haven't measured the >>> delta though. >>> >>>> >>>>>>> + >>>>>>> +#ifndef CONFIG_IOTHREAD >>>>>>> + if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) { >>>>>>> + qemu_notify_event(); >>>>>>> + } >>>>>>> +#endif >>>>>> >>>>>> Why is this necessary? >>>>>> >>>>>> You should break out of cpu_exec_all if there's a pending alarm (see >>>>>> qemu_alarm_pending()). >>>>> >>>>> qemu_alarm_pending() is not true until the signal is actually taken. The >>>>> alarm handler sets the required flags. >>>> >>>> Right. What i mean is you need to execute the signal handler inside >>>> cpu_exec_all loop (so that alarm pending is set). >>>> >>>> So, if there is a SIGALRM pending, qemu_run_timers has highest >>>> priority, not vcpu execution. >>> >>> We leave the vcpu loop (thanks to notify_event), process the signal in >>> the event loop and run the timer handler. This pattern is IMO less >>> invasive to the existing code, specifically as it is about to die >>> long-term anyway. >> >> You'll probably see poor timer behaviour on smp guests without iothread >> enabled. >> > > Still checking, but that would mean the notification mechanism is broken > anyway: If IO events do not force us to process them quickly, we already > suffer from latencies in SMP mode. Maybe a regression caused by the iothread introduction: we need to break out of the cpu loop via global exit_request when there is an IO event pending. Fixing this will also heal the above issue. Sigh, we need to get rid of those two implementations and focus all reviewing and testing on one. I bet there are still more corner cases sleeping somewhere. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux