From: Paolo Bonzini
To: Ben Gardon
Cc: LKML, kvm, Peter Xu, Sean Christopherson, Peter Shier, Peter Feiner,
    Junaid Shahid, Jim Mattson, Yulei Zhang, Wanpeng Li, Vitaly Kuznetsov,
    Xiao Guangrong
Subject: Re: [PATCH v2 20/28] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map
Date: Thu, 1 Apr 2021 19:32:07 +0200
References: <20210202185734.1680553-1-bgardon@google.com>
 <20210202185734.1680553-21-bgardon@google.com>

On 01/04/21 18:50, Ben Gardon wrote:
>> retry:
>> 		if (is_shadow_present_pte(iter.old_spte)) {
>> 			if (is_large_pte(iter.old_spte)) {
>> 				if (!tdp_mmu_zap_spte_atomic(vcpu->kvm, &iter))
>> 					break;
>>
>> 				/*
>> 				 * The iter must explicitly re-read the SPTE because
>> 				 * the atomic cmpxchg failed.
>> 				 */
>> 				iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep));
>> 				goto retry;
>> 			}
>> 		} else {
>> 			...
>> 		}
>>
>> ?
> To be honest, that feels less readable to me. For me retry implies
> that we failed to make progress and need to repeat an operation, but
> the reality is that we did make progress and there are just multiple
> steps to replace the large SPTE with a child PT.

You're right, it makes no sense -- I misremembered the direction of
tdp_mmu_zap_spte_atomic's return value.  I was actually thinking of this:

> Another option which could improve readability and performance would
> be to use the retry to repeat failed cmpxchgs instead of breaking out
> of the loop. Then we could avoid retrying the page fault each time a
> cmpxchg failed, which may happen a lot as vCPUs allocate intermediate
> page tables on boot. (Probably less common for leaf entries, but
> possibly useful there too.)

which would be

 retry:
		if (is_shadow_present_pte(iter.old_spte)) {
			if (is_large_pte(iter.old_spte)) {
				if (!tdp_mmu_zap_spte_atomic(vcpu->kvm, &iter)) {
					/*
					 * The iter must explicitly re-read the SPTE because
					 * the atomic cmpxchg failed.
					 */
					iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep));
					goto retry;
				}
				/* XXX move this to tdp_mmu_zap_spte_atomic? */
				iter.old_spte = 0;
			} else {
				continue;
			}
		}

		sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level);
		child_pt = sp->spt;
		new_spte = make_nonleaf_spte(child_pt, !shadow_accessed_mask);

		if (!tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, new_spte)) {
			tdp_mmu_free_sp(sp);
			/*
			 * The iter must explicitly re-read the SPTE because
			 * the atomic cmpxchg failed.
			 */
			iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep));
			goto retry;
		}

		tdp_mmu_link_page(vcpu->kvm, sp, true,
				  huge_page_disallowed && req_level >= iter.level);
		trace_kvm_mmu_get_page(sp, true);

which survives at least a quick smoke test of booting a 20-vCPU Windows
guest.  If you agree I'll turn this into an actual patch.
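
As an aside, the cmpxchg-retry pattern being proposed above can be sketched
in isolation.  The program below is only an illustrative stand-alone sketch,
not KVM code: it uses C11 atomics on a single fake 64-bit "SPTE", and
install_child()/make_child_spte() and the bit layout are made-up stand-ins
for the alloc_tdp_mmu_page()/make_nonleaf_spte()/tdp_mmu_set_spte_atomic()
sequence.  It only shows the shape of the idea: when the compare-and-swap
fails, pick up the current value and retry locally instead of failing the
whole operation back to the caller.

	#include <stdatomic.h>
	#include <stdint.h>
	#include <stdio.h>

	#define PRESENT_BIT	(1ull << 0)

	static _Atomic uint64_t spte;	/* one fake page-table entry */

	/* Made-up encoding: a page frame number next to a present bit. */
	static uint64_t make_child_spte(uint64_t child_pfn)
	{
		return (child_pfn << 12) | PRESENT_BIT;
	}

	/*
	 * Install a child entry unless one is already present.  On a
	 * failed compare-and-swap, retry locally with the freshly
	 * observed value instead of reporting failure to the caller.
	 */
	static void install_child(uint64_t child_pfn)
	{
		uint64_t old = atomic_load(&spte);
		uint64_t new;

	retry:
		if (old & PRESENT_BIT)
			return;		/* another thread already mapped it */

		new = make_child_spte(child_pfn);
		if (!atomic_compare_exchange_strong(&spte, &old, new)) {
			/* The failed CAS stored the current value in "old". */
			goto retry;
		}
	}

	int main(void)
	{
		install_child(42);
		printf("spte = %#llx\n", (unsigned long long)atomic_load(&spte));
		return 0;
	}

In the email's snippets the re-read goes through
READ_ONCE(*rcu_dereference(iter.sptep)) and the compare-and-swap is the
cmpxchg performed inside tdp_mmu_zap_spte_atomic()/tdp_mmu_set_spte_atomic();
atomic_compare_exchange_strong() merely plays that role here, and because a
failed C11 CAS already leaves the current value in "old", no separate
re-read is needed in this sketch.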