From: Alexander Graf
To: Peter Collingbourne
Cc: Peter Maydell, Eduardo Habkost, Richard Henderson, qemu-devel, Cameron Esfahani, Roman Bolshakov, qemu-arm@nongnu.org, Claudio Fontana, Frank Yang, Paolo Bonzini
Subject: Re: [PATCH] arm/hvf: Optimize and simplify WFI handling
Date: Tue, 1 Dec 2020 23:09:44 +0100
Message-ID: <8cc9052b-da85-de93-9d54-d4d0730054ec@csgraf.de>
References: <20201201082142.649007-1-pcc@google.com> <5b691ccb-43bb-5955-d47a-cae39c59522c@csgraf.de>

On 01.12.20 21:03, Peter Collingbourne wrote:
> On Tue, Dec 1, 2020 at 8:26 AM Alexander Graf wrote:
>>
>> On 01.12.20 09:21, Peter Collingbourne wrote:
>>> Sleep on WFx until the VTIMER is due but allow ourselves to be woken
>>> up on IPI.
>>>
>>> Signed-off-by: Peter Collingbourne
>>> ---
>>> Alexander Graf wrote:
>>>> I would love to take a patch from you here :). I'll still be stuck for a
>>>> while with the sysreg sync rework that Peter asked for before I can look
>>>> at WFI again.
>>> Okay, here's a patch :) It's a relatively straightforward adaptation
>>> of what we have in our fork, which can now boot Android to GUI while
>>> remaining at around 4% CPU when idle.
>>>
>>> I'm not set up to boot a full Linux distribution at the moment so I
>>> tested it on upstream QEMU by running a recent mainline Linux kernel
>>> with a rootfs containing an init program that just does sleep(5)
>>> and verified that the qemu process remains at low CPU usage during
>>> the sleep. This was on top of your v2 plus the last patch of your v1
>>> since it doesn't look like you have a replacement for that logic yet.
>>
>> How about something like this instead?
>>
>>
>> Alex
>>
>>
>> diff --git a/accel/hvf/hvf-cpus.c b/accel/hvf/hvf-cpus.c
>> index 4360f64671..50384013ea 100644
>> --- a/accel/hvf/hvf-cpus.c
>> +++ b/accel/hvf/hvf-cpus.c
>> @@ -337,16 +337,18 @@ static int hvf_init_vcpu(CPUState *cpu)
>>      cpu->hvf = g_malloc0(sizeof(*cpu->hvf));
>>
>>      /* init cpu signals */
>> -    sigset_t set;
>>      struct sigaction sigact;
>>
>>      memset(&sigact, 0, sizeof(sigact));
>>      sigact.sa_handler = dummy_signal;
>>      sigaction(SIG_IPI, &sigact, NULL);
>>
>> -    pthread_sigmask(SIG_BLOCK, NULL, &set);
>> -    sigdelset(&set, SIG_IPI);
>> -    pthread_sigmask(SIG_SETMASK, &set, NULL);
>> +    pthread_sigmask(SIG_BLOCK, NULL, &cpu->hvf->sigmask);
>> +    sigdelset(&cpu->hvf->sigmask, SIG_IPI);
>> +    pthread_sigmask(SIG_SETMASK, &cpu->hvf->sigmask, NULL);
>> +
>> +    pthread_sigmask(SIG_BLOCK, NULL, &cpu->hvf->sigmask_ipi);
>> +    sigaddset(&cpu->hvf->sigmask_ipi, SIG_IPI);
> There's no reason to unblock SIG_IPI while not in pselect and it can
> easily lead to missed wakeups. The whole point of pselect is so that
> you can guarantee that only one part of your program sees signals
> without a possibility of them being missed.

Hm, I think I start to agree with you here :). We can probably just
leave SIG_IPI masked at all times and only unmask on pselect. The worst
thing that will happen is a premature wakeup if we did get an IPI
incoming while hvf->sleeping is set, but were either not running
pselect() yet and bailed out or already finished pselect() execution.
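To make the "masked at all times, unmasked only inside pselect()" idea concrete, here is a minimal standalone sketch. It is not QEMU code: SIG_KICK (mapped to SIGUSR1), kick_init() and sleep_until_kick() are hypothetical names chosen for illustration, and the sketch assumes a plain POSIX environment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

/* Stand-in for QEMU's SIG_IPI; using SIGUSR1 is an assumption of this sketch. */
#define SIG_KICK SIGUSR1

static void dummy_signal(int sig)
{
    (void)sig; /* the handler only exists so that pselect() gets interrupted */
}

/*
 * Install the handler and keep SIG_KICK blocked in the calling thread from
 * now on. On return, *sleep_mask holds the mask to use while sleeping: the
 * thread's previous mask with SIG_KICK removed.
 */
static void kick_init(sigset_t *sleep_mask)
{
    struct sigaction sigact;
    sigset_t block;
    int r;

    memset(&sigact, 0, sizeof(sigact));
    sigact.sa_handler = dummy_signal;
    r = sigaction(SIG_KICK, &sigact, NULL);
    assert(r == 0);

    sigemptyset(&block);
    sigaddset(&block, SIG_KICK);
    r = pthread_sigmask(SIG_BLOCK, &block, sleep_mask);
    assert(r == 0);

    sigdelset(sleep_mask, SIG_KICK);
}

/*
 * Sleep for at most *timeout. pselect() swaps in sleep_mask atomically, so a
 * kick that became pending before the call is delivered as soon as the sleep
 * starts instead of being lost.
 * Returns 1 if a signal ended the sleep, 0 if the timeout expired.
 */
static int sleep_until_kick(const sigset_t *sleep_mask,
                            const struct timespec *timeout)
{
    int r = pselect(0, NULL, NULL, NULL, timeout, sleep_mask);

    return r == -1 && errno == EINTR;
}
```

A thread would call kick_init() once and then sleep_until_kick() whenever it wants to idle. A kick sent with pthread_kill() before the sleep starts stays pending and ends the pselect() call immediately, which is the premature-but-never-missed wakeup described above.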
>
>>  #ifdef __aarch64__
>>      r = hv_vcpu_create(&cpu->hvf->fd, (hv_vcpu_exit_t **)&cpu->hvf->exit, NULL);
>> diff --git a/include/sysemu/hvf_int.h b/include/sysemu/hvf_int.h
>> index c56baa3ae8..6e237f2db0 100644
>> --- a/include/sysemu/hvf_int.h
>> +++ b/include/sysemu/hvf_int.h
>> @@ -62,8 +62,9 @@ extern HVFState *hvf_state;
>>  struct hvf_vcpu_state {
>>      uint64_t fd;
>>      void *exit;
>> -    struct timespec ts;
>>      bool sleeping;
>> +    sigset_t sigmask;
>> +    sigset_t sigmask_ipi;
>>  };
>>
>>  void assert_hvf_ok(hv_return_t ret);
>> diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
>> index 0c01a03725..350b845e6e 100644
>> --- a/target/arm/hvf/hvf.c
>> +++ b/target/arm/hvf/hvf.c
>> @@ -320,20 +320,24 @@ int hvf_arch_init_vcpu(CPUState *cpu)
>>
>>  void hvf_kick_vcpu_thread(CPUState *cpu)
>>  {
>> -    if (cpu->hvf->sleeping) {
>> -        /*
>> -         * When sleeping, make sure we always send signals. Also, clear the
>> -         * timespec, so that an IPI that arrives between setting hvf->sleeping
>> -         * and the nanosleep syscall still aborts the sleep.
>> -         */
>> -        cpu->thread_kicked = false;
>> -        cpu->hvf->ts = (struct timespec){ };
>> +    if (qatomic_read(&cpu->hvf->sleeping)) {
>> +        /* When sleeping, send a signal to get out of pselect */
>>          cpus_kick_thread(cpu);
>>      } else {
>>          hv_vcpus_exit(&cpu->hvf->fd, 1);
>>      }
>>  }
>>
>> +static void hvf_block_sig_ipi(CPUState *cpu)
>> +{
>> +    pthread_sigmask(SIG_SETMASK, &cpu->hvf->sigmask_ipi, NULL);
>> +}
>> +
>> +static void hvf_unblock_sig_ipi(CPUState *cpu)
>> +{
>> +    pthread_sigmask(SIG_SETMASK, &cpu->hvf->sigmask, NULL);
>> +}
>> +
>>  static int hvf_inject_interrupts(CPUState *cpu)
>>  {
>>      if (cpu->interrupt_request & CPU_INTERRUPT_FIQ) {
>> @@ -354,6 +358,7 @@ int hvf_vcpu_exec(CPUState *cpu)
>>      ARMCPU *arm_cpu = ARM_CPU(cpu);
>>      CPUARMState *env = &arm_cpu->env;
>>      hv_vcpu_exit_t *hvf_exit = cpu->hvf->exit;
>> +    const uint32_t irq_mask = CPU_INTERRUPT_HARD | CPU_INTERRUPT_FIQ;
>>      hv_return_t r;
>>      int ret = 0;
>>
>> @@ -491,8 +496,8 @@ int hvf_vcpu_exec(CPUState *cpu)
>>          break;
>>      }
>>      case EC_WFX_TRAP:
>> -        if (!(syndrome & WFX_IS_WFE) && !(cpu->interrupt_request &
>> -            (CPU_INTERRUPT_HARD | CPU_INTERRUPT_FIQ))) {
>> +        if (!(syndrome & WFX_IS_WFE) &&
>> +            !(cpu->interrupt_request & irq_mask)) {
>>              uint64_t cval, ctl, val, diff, now;
> I don't think the access to cpu->interrupt_request is safe because it
> is done while not under the iothread lock. That's why to avoid these
> types of issues I would prefer to hold the lock almost all of the
> time.

In this branch, that's not a problem yet. On stale values, we either
don't sleep (which is ok), or we go into the sleep path, and reevaluate
cpu->interrupt_request atomically again after setting hvf->sleeping.

>
>>              /* Set up a local timer for vtimer if necessary ... */
>> @@ -515,9 +520,7 @@ int hvf_vcpu_exec(CPUState *cpu)
>>
>>              if (diff < INT64_MAX) {
>>                  uint64_t ns = diff * gt_cntfrq_period_ns(arm_cpu);
>> -                struct timespec *ts = &cpu->hvf->ts;
>> -
>> -                *ts = (struct timespec){
>> +                struct timespec ts = {
>>                      .tv_sec = ns / NANOSECONDS_PER_SECOND,
>>                      .tv_nsec = ns % NANOSECONDS_PER_SECOND,
>>                  };
>> @@ -526,27 +529,31 @@ int hvf_vcpu_exec(CPUState *cpu)
>>                   * Waking up easily takes 1ms, don't go to sleep for smaller
>>                   * time periods than 2ms.
>>                   */
>> -                if (!ts->tv_sec && (ts->tv_nsec < (SCALE_MS * 2))) {
>> +                if (!ts.tv_sec && (ts.tv_nsec < (SCALE_MS * 2))) {
>>                      advance_pc = true;
>>                      break;
>>                  }
>>
>> +                /* block SIG_IPI for the sleep */
>> +                hvf_block_sig_ipi(cpu);
>> +                cpu->thread_kicked = false;
>> +
>>                  /* Set cpu->hvf->sleeping so that we get a SIG_IPI signal. */
>> -                cpu->hvf->sleeping = true;
>> -                smp_mb();
>> +                qatomic_set(&cpu->hvf->sleeping, true);
> This doesn't protect against races because another thread could call
> hvf_kick_vcpu_thread() at any time between when we return from
> hv_vcpu_run() and when we set sleeping = true and we would miss the
> wakeup (due to hvf_kick_vcpu_thread() seeing sleeping = false and
> calling hv_vcpus_exit() instead of pthread_kill()). I don't think it
> can be fixed by setting sleeping to true earlier either because no
> matter how early you move it, there will always be a window where we
> are going to pselect() but sleeping is false, resulting in a missed
> wakeup.

I don't follow. If anyone was sending us an IPI, it's because they want
to notify us about an update to cpu->interrupt_request, right? In that
case, the atomic read of that field below will catch it and bail out of
the sleep sequence.
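Spelled out as a standalone sketch, the handshake I have in mind looks roughly like this. It uses sequentially consistent C11 atomics as a stand-in for QEMU's qatomic_* helpers, whose exact ordering guarantees are not examined here. It also assumes that whoever kicks us publishes the interrupt request before looking at the sleeping flag. struct vcpu, prepare_sleep() and post_interrupt() are hypothetical names, not QEMU APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down vCPU state for illustration only. */
struct vcpu {
    atomic_bool sleeping;
    atomic_uint interrupt_request;
};

/*
 * Sleeper side: announce the intent to sleep first, then re-check for
 * pending interrupts. Returns true if it is safe to enter the sleep.
 */
static bool prepare_sleep(struct vcpu *v, unsigned int irq_mask)
{
    assert(v != NULL);

    atomic_store(&v->sleeping, true);
    if (atomic_load(&v->interrupt_request) & irq_mask) {
        /* An interrupt was posted in the meantime: bail out of the sleep. */
        atomic_store(&v->sleeping, false);
        return false;
    }
    return true;
}

/*
 * Kicker side: publish the interrupt first, then check whether the target
 * is sleeping. Returns true if the sleeper needs a wakeup signal.
 */
static bool post_interrupt(struct vcpu *v, unsigned int irq)
{
    assert(v != NULL);

    atomic_fetch_or(&v->interrupt_request, irq);
    return atomic_load(&v->sleeping);
}
```

With both sides ordered this way, at least one of them notices the other: either the sleeper sees the posted interrupt in its re-check and bails out, or the kicker sees sleeping == true and sends the signal. A kick that is not accompanied by an update to interrupt_request is not covered by this argument.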
>
> Peter
>
>> -                /* Bail out if we received an IRQ meanwhile */
>> -                if (cpu->thread_kicked || (cpu->interrupt_request &
>> -                    (CPU_INTERRUPT_HARD | CPU_INTERRUPT_FIQ))) {
>> -                    cpu->hvf->sleeping = false;
>> +                /* Bail out if we received a kick meanwhile */
>> +                if (qatomic_read(&cpu->interrupt_request) & irq_mask) {
>> +                    qatomic_set(&cpu->hvf->sleeping, false);

^^^

Alex

>> +                    hvf_unblock_sig_ipi(cpu);
>>                      break;
>>                  }
>>
>> -                /* nanosleep returns on signal, so we wake up on kick. */
>> -                nanosleep(ts, NULL);
>> +                /* pselect returns on kick signal and consumes it */
>> +                pselect(0, 0, 0, 0, &ts, &cpu->hvf->sigmask);
>>
>>                  /* Out of sleep - either naturally or because of a kick */
>> -                cpu->hvf->sleeping = false;
>> +                qatomic_set(&cpu->hvf->sleeping, false);
>> +                hvf_unblock_sig_ipi(cpu);
>>              }
>>
>>              advance_pc = true;
>>