From: Paolo Bonzini
Subject: Re: [PATCH v2] x86/kvm: Disable KVM_ASYNC_PF_SEND_ALWAYS
To: Thomas Gleixner, Andy Lutomirski, Vivek Goyal
Cc: Peter Zijlstra, Andy Lutomirski, LKML, X86 ML, kvm list, stable
Date: Thu, 9 Apr 2020 11:03:50 +0200
Message-ID: <92ea7036-0b77-20da-34ac-f425e6f233c2@redhat.com>
In-Reply-To: <87pncib06x.fsf@nanos.tec.linutronix.de>
References: <20200407172140.GB64635@redhat.com>
 <772A564B-3268-49F4-9AEA-CDA648F6131F@amacapital.net>
 <87eeszjbe6.fsf@nanos.tec.linutronix.de>
 <874ktukhku.fsf@nanos.tec.linutronix.de>
 <274f3d14-08ac-e5cc-0b23-e6e0274796c8@redhat.com>
 <87pncib06x.fsf@nanos.tec.linutronix.de>
X-Mailing-List: kvm@vger.kernel.org

On 08/04/20 15:01, Thomas Gleixner wrote:
>
> And it comes with restrictions:
>
>     The Do Other Stuff event can only be delivered when guest IF=1.
>
>     If guest IF=0 then the host has to suspend the guest until the
>     situation is resolved.
>
>     The 'Situation resolved' event must also wait for a guest IF=1 slot.

Additionally:

- the do other stuff event must be delivered to the same CPU that is
  causing the host-side page fault

- the do other stuff event provides a token that identifies the cause
  and the situation resolved event provides a matching token

This stuff is why I think the do other stuff event looks very much like
a #VE.  But I think we're in violent agreement after all.

> If you just want to solve Vivek's problem, then it's good enough. I.e.
> the file truncation turns the EPT entries into #VE convertible entries
> and the guest #VE handler can figure it out. This one can be injected
> directly by the hardware, i.e. you don't need a VMEXIT.
>
> If you want the opportunistic do other stuff mechanism, then #VE has
> exactly the same problems as the existing async "PF". It's not magically
> making that go away.

You can inject #VE from the hypervisor too, with PV magic to distinguish
the two.  However, that's not necessarily a good idea because it makes it
harder to switch to hardware delivery in the future.

> One possible solution might be to make all recoverable EPT entries
> convertible and let the HW inject #VE for those.
>
> So the #VE handler in the guest would have to do:
>
>      if (!recoverable()) {
>              if (user_mode)
>                      send_signal();
>              else if (!fixup_exception())
>                      die_hard();
>              goto done;
>      }
>
>      store_ve_info_in_pv_page();
>
>      if (!user_mode(regs) || !preemptible()) {
>              hypercall_resolve_ept(can_continue = false);
>      } else {
>              init_completion();
>              hypercall_resolve_ept(can_continue = true);
>              wait_for_completion();
>      }
>
> or something like that.

Yes, pretty much.  The VE info can also be passed down to the hypercall
as arguments.

Paolo

> The hypercall to resolve the EPT fail on the host acts on the
> can_continue argument.
>
> If false, it suspends the guest vCPU and only returns when done.
>
> If true it kicks the resolve process and returns to the guest which
> suspends the task and tries to do something else.
>
> The wakeup side needs to be a regular interrupt and cannot go through
> #VE.
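
As a rough illustration of the flow discussed above, here is a minimal
userspace sketch in C of the handler's decision logic, the delivery
restrictions and the token rule. It is not kernel code: ve_decide(),
ve_can_deliver(), ve_resolved(), struct ve_context, struct ve_waiter and
the enum values are all hypothetical names made up for this sketch, and
the hypercall is only represented by the action that would trigger it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical; none of this is existing kernel API. */

enum ve_action {
	VE_SIGNAL_TASK,      /* unrecoverable, taken in user mode */
	VE_FIXUP_OR_DIE,     /* unrecoverable, taken in kernel mode */
	VE_RESOLVE_BLOCKING, /* hypercall with can_continue = false */
	VE_RESOLVE_ASYNC,    /* hypercall with can_continue = true, then sleep */
};

struct ve_context {
	bool recoverable;
	bool user_mode;
	bool preemptible;
};

/* Mirrors the branch structure of the quoted pseudocode. */
static enum ve_action ve_decide(const struct ve_context *c)
{
	assert(c);
	if (!c->recoverable)
		return c->user_mode ? VE_SIGNAL_TASK : VE_FIXUP_OR_DIE;
	if (!c->user_mode || !c->preemptible)
		return VE_RESOLVE_BLOCKING;
	return VE_RESOLVE_ASYNC;
}

/*
 * Restrictions on the "do other stuff" event: guest IF=1, and delivery
 * on the same CPU that caused the host-side page fault.
 */
static bool ve_can_deliver(bool guest_if, uint32_t faulting_cpu,
			   uint32_t target_cpu)
{
	return guest_if && faulting_cpu == target_cpu;
}

struct ve_waiter {
	uint32_t token; /* handed out with the "do other stuff" event */
	bool done;
};

/* "Situation resolved" completes a waiter only if the token matches. */
static bool ve_resolved(struct ve_waiter *w, uint32_t token)
{
	assert(w);
	if (w->done || w->token != token)
		return false;
	w->done = true;
	return true;
}
```

In this sketch only VE_RESOLVE_ASYNC corresponds to the path where the
guest suspends the task and does something else; every other case either
ends the fault handling or leaves the vCPU suspended on the host side.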