Date: Wed, 5 Aug 2020 17:37:21 -0700 (PDT)
From: Stefano Stabellini
To: Jan Beulich
Cc: 'Kevin Tian', Stefano Stabellini, Julien Grall, 'Wei Liu', paul@xen.org, 'Andrew Cooper', 'Ian Jackson', 'George Dunlap', 'Tim Deegan', Oleksandr, 'Oleksandr Tyshchenko', 'Julien Grall', 'Jun Nakajima', xen-devel@lists.xenproject.org, 'Roger Pau Monné'
Subject: Re: [RFC PATCH V1 01/12] hvm/ioreq: Make x86's IOREQ feature common
In-Reply-To: <2ab4c567-8efa-1b9d-ab00-4ea7e1ab323e@suse.com>
References: <1596478888-23030-1-git-send-email-olekstysh@gmail.com> <1596478888-23030-2-git-send-email-olekstysh@gmail.com> <000c01d66a33$2bd56510$83802f30$@xen.org> <9f83a7ed-ca97-449f-c7b9-a1140644abe9@gmail.com> <2ab4c567-8efa-1b9d-ab00-4ea7e1ab323e@suse.com>
List-Id: Xen developer discussion <xen-devel@lists.xenproject.org>

On Wed, 5 Aug 2020, Jan Beulich wrote:
> On 04.08.2020 21:11, Stefano Stabellini wrote:
> >> The point of the check isn't to determine whether to wait, but
> >> what to do after having waited.
> >> Reads need a retry round through
> >> the emulator (to store the result in the designated place),
> >> while writes don't have such a requirement (and hence guest
> >> execution can continue immediately in the general case).
> >
> > The x86 code looks like this:
> >
> >     rc = hvm_send_ioreq(s, &p, 0);
> >     if ( rc != X86EMUL_RETRY || currd->is_shutting_down )
> >         vio->io_req.state = STATE_IOREQ_NONE;
> >     else if ( !hvm_ioreq_needs_completion(&vio->io_req) )
> >         rc = X86EMUL_OKAY;
> >
> > Basically hvm_send_ioreq is expected to return RETRY.
> > Then, if it is a PIO write operation only, it is turned into OKAY right
> > away. Otherwise, rc stays as RETRY.
> >
> > So, normally, hvmemul_do_io is expected to return RETRY, because the
> > emulator is not done yet. Am I understanding the code correctly?
>
> "The emulator" unfortunately is ambiguous here: Do you mean qemu
> (or whichever else ioreq server) or the x86 emulator inside Xen?

I meant QEMU. I'll use "QEMU" instead of "emulator" in this thread going
forward for clarity.

> There are various conditions leading to RETRY. As far as
> hvm_send_ioreq() goes, it is expected to return RETRY whenever
> some sort of response is to be expected (the most notable
> exception being the hvm_send_buffered_ioreq() path), or when
> submitting the request isn't possible in the first place.
>
> > If so, who is handling RETRY on x86? I tried to follow the call chain
> > but ended up in the x86 emulator and got lost :-)
>
> Not sure I understand the question correctly, but I'll try an
> answer nevertheless: hvm_send_ioreq() arranges for the vCPU to be
> put to sleep (prepare_wait_on_xen_event_channel()). Once the event
> channel got signaled (and the vCPU unblocked), hvm_do_resume() ->
> handle_hvm_io_completion() -> hvm_wait_for_io() then check whether
> the wait reason has been satisfied (wait_on_xen_event_channel()),
> and ...
>
> > At some point later, after the emulator (QEMU) has completed the
> > request, handle_hvm_io_completion gets called which ends up calling
> > handle_mmio() finishing the job on the Xen side too.
>
> ..., as you say, handle_hvm_io_completion() invokes the retry of
> the original operation (handle_mmio() or handle_pio() in
> particular) if need be.

OK, thanks for the details. My interpretation seems to be correct.

In which case, it looks like xen/arch/arm/io.c:try_fwd_ioserv should
return IO_RETRY. Then, xen/arch/arm/traps.c:do_trap_stage2_abort_guest
also needs to handle try_handle_mmio returning IO_RETRY the first time
around, and IO_HANDLED after QEMU does its job.

What should do_trap_stage2_abort_guest do on IO_RETRY? Simply return
early and let the scheduler do its job? Something like:

    enum io_state state = try_handle_mmio(regs, hsr, gpa);

    switch ( state )
    {
    case IO_ABORT:
        goto inject_abt;
    case IO_HANDLED:
        advance_pc(regs, hsr);
        return;
    case IO_RETRY:
        /* finish later */
        return;
    case IO_UNHANDLED:
        /* IO unhandled, try another way to handle it. */
        break;
    default:
        ASSERT_UNREACHABLE();
    }

Then, xen/arch/arm/ioreq.c:handle_mmio() gets called by
handle_hvm_io_completion() after QEMU completes the emulation. Today,
handle_mmio just sets the user register with the read value. But it
would be better if it called the original function
do_trap_stage2_abort_guest again to actually retry the original
operation. This time do_trap_stage2_abort_guest calls try_handle_mmio()
and gets IO_HANDLED instead of IO_RETRY, thus it will advance_pc (the
program counter), completing the handling of this instruction.

The user register with the read value could be set by try_handle_mmio
if try_fwd_ioserv returns IO_HANDLED instead of IO_RETRY.

Is that how the state machine is expected to work?
> What's potentially confusing is that there's a second form of
> retry, invoked by the x86 insn emulator itself when it needs to
> split complex insns (the repeated string insns being the most
> important example). This results in actually exiting back to guest
> context without having advanced rIP, but after having updated
> other register state suitably (to express the progress made so
> far).

Ah! And it seems to use exactly the same label: X86EMUL_RETRY. It would
be a good idea to differentiate between the two.